Make data device resident in GPNORM_TRANS#416
Conversation
There was a problem hiding this comment.
Pull request overview
This PR extends GPNORM_TRANS to better support GPU execution by keeping key working data resident on the device across the routine, and by enabling GPU-aware point-to-point communications (with optional raw MPI support). It also updates the public interface so callers can indicate when PGP is already device-resident.
Changes:
- Added optional
LPGP_ON_GPUargument toGPNORM_TRANS(interface + CPU/GPU implementations). - Refactored the GPU implementation to maintain device-resident temporaries across the workflow and integrate GPU-aware send/recv paths.
- Kept CPU implementation interface-compatible while leaving behavior unchanged.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
src/trans/include/ectrans/gpnorm_trans.h |
Extends the public interface to include optional LPGP_ON_GPU. |
src/trans/gpu/external/gpnorm_trans.F90 |
Major GPU-path refactor: persistent device data usage + GPU-aware comms + optional raw MPI handling. |
src/trans/cpu/external/gpnorm_trans.F90 |
Adds the optional argument for interface compatibility (unused on CPU). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
I suggest we broaden the scope slightly and call it "tidy of GPNORM_TRANS" on GPU, and also delete gpnorm_trans_gpu.F90. The latter has never been functional and will be redundant once this is merged. |
|
I added a test suite for |
|
Failure on LUMI. Is this directive correct @awnawab? gpnorm_trans.F90:224-228 Arrays |
I suspect it's the present map modifier. Looking back at it, the only function that serves is a safety check to abort if that symbol isn't already present, and of course it should throw an error for variables yet to be allocated on device. I think removing it is the right fix. Thanks a lot for adding the test! Otherwise this would have gone uncaught. |
8a91cfb to
5fd52e7
Compare
|
We have now an error on LUMI only affecting mpi2 tests, which is not at all obvious. I'll have to reproduce it and interactively debug it on LUMI. |
|
Considering the actual use case for this is in ifs coupled builds, and there's no pressing need to do that on lumi, we can exclude these 4 tests on lumi, leave an issue and then I'm happy to try to debug it in a non time pressing manner. |
5fd52e7 to
13a36dd
Compare
|
Looks like you spotted a bug @awnawab ...? Tests passing on LUMI now. |
|
Yeah, I'd misunderstood the difference between |
|
Then I'll take a proper look at gpu/external/gpnorm_trans.F90 and give a final review. |
| LLPGP_ON_GPU = .FALSE. | ||
| IF (PRESENT(LPGP_ON_GPU)) LLPGP_ON_GPU = LPGP_ON_GPU | ||
|
|
||
| IF (.NOT. LLPGP_ON_GPU) THEN | ||
| IUBOUND(1:3)=UBOUND(PGP) | ||
| IF(IUBOUND(1) < NPROMA) THEN | ||
| WRITE(NOUT,*)'GPNORM_TRANS:FIRST DIM. OF PGP TOO SMALL ',IUBOUND(1),NPROMA | ||
| CALL ABORT_TRANS('GPNORM_TRANS:FIRST DIMENSION OF PGP TOO SMALL ') | ||
| ENDIF | ||
| IF(IUBOUND(2) < KFIELDS) THEN | ||
| WRITE(NOUT,*)'GPNORM_TRANS:SEC. DIM. OF PGP TOO SMALL ',IUBOUND(2),KFIELDS | ||
| CALL ABORT_TRANS('GPNORM_TRANS:SECOND DIMENSION OF PGP TOO SMALL ') | ||
| ENDIF | ||
| IF(IUBOUND(3) < NGPBLKS) THEN | ||
| WRITE(NOUT,*)'GPNORM_TRANS:THIRD DIM. OF PGP TOO SMALL ',IUBOUND(3),NGPBLKS | ||
| CALL ABORT_TRANS('GPNORM_TRANS:THIRD DIMENSION OF PGP TOO SMALL ') | ||
| ENDIF | ||
| ENDIF |
There was a problem hiding this comment.
No I wouldn't think so. In the forecast timestep hotpath, running those consistency will incur an overhead, and it feels like overkill. However outside the hotpath, where we also call GPNORM_TRANS, the data in those cases is on host anyway, and there's little overhead for those checks.
This PR makes data device resident throughout the execution of
GPNORM_TRANS, including the addition of gpu aware comms.