Using PSyclone
Overview
This section contains step-by-step instructions that demonstrate an application of the
PSyclone-based source-code processing option available in the NEMO build system.
It assumes that PSyclone has been correctly installed as detailed in the NEMO
installation guide and that an appropriate arch file has been generated
(hereafter referred to as arch-auto.fcm).
This initial application is simply a PSyclone passthrough of a NEMO configuration. That is, the source code is processed through PSyclone but the resultant code is functionally equivalent to the original source (although standardisation of some F90 constructs will be carried out). Later additions to this guide will illustrate how to add transformation scripts to this process to perform complex tasks such as identifying computational kernels and inserting OpenACC directives for GPU offloading.
PSyclone processing of the NEMO source code is available as an option in the NEMO build system, and PSyclone transformations can be enabled via the option
-p <PSyclone processing option>
of the ./makenemo command (-p all lists the available PSyclone transformations).
The different options correspond to transformation scripts in directory sct/, with the
exception of passthrough (which is supported internally). This initial part of the
PSyclone-processing guide demonstrates the use of the passthrough option from scratch,
including the compilation of a model configuration with and without PSyclone passthrough,
running of the model, and verification of the passthrough.
Compilation and running of the BENCH test case
Step 1 - Compile the BENCH test case with and without PSyclone passthrough
At the top level of the NEMO repository,
$ cd nemo
a reference configuration of the BENCH test case can be compiled:
$ ./makenemo -m auto -a BENCH -n BENCH_0 -j 8 -v 1
Next, a corresponding configuration with PSyclone passthrough can be built:
$ ./makenemo -m auto -a BENCH -n BENCH_PT -j 8 -v 1 -p passthrough
For demonstration purposes, a simple, non-invasive PSyclone transformation script can alternatively be enabled:
$ ./makenemo -m auto -a BENCH -n BENCH_INFO -j 8 -v 1 -p list_symbols
This variant produces the same source code as the passthrough, but during the build process it outputs the names of all variables defined in the majority of the Fortran modules and in the associated module procedures.
Step 2 - Prepare a suitable configuration
For testing, a BENCH configuration with a small domain (ORCA2 equivalent) and a small number of time steps (80) can be generated with:
$ sed -e 's/nn_itend.*/nn_itend=80/' -e 's/nn_isize.*/nn_isize=180/' -e 's/nn_jsize.*/nn_jsize=148/' -e 's/nn_ksize.*/nn_ksize=31/' -e 's/ln_timing.*/ln_timing=.true./' -e '/\&namctl/asn_cfctl%l_runstat=.true.' tests/BENCH/EXPREF/namelist_cfg_orca1_like > ./namelist_cfg
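To illustrate what these substitutions do, the following sketch applies the same sed expressions to a small stand-in namelist. The file contents, the variable demo_dir, and the temporary paths are hypothetical and serve only as a demonstration; the one-line form of the sed a (append) command is assumed to require GNU sed.

```shell
# Sketch: apply the substitutions to a hypothetical miniature namelist.
# The input values below are made up for illustration; GNU sed is assumed.
demo_dir=$(mktemp -d)
cat > "$demo_dir/namelist_in" <<'EOF'
&namusr_def
   nn_isize = 362
   nn_jsize = 332
   nn_ksize = 75
/
&namrun
   nn_itend = 1000
/
&namctl
   ln_timing = .false.
/
EOF

# Each s/// replaces everything from the parameter name to the end of the line;
# the last expression appends a new line after the line containing &namctl.
sed -e 's/nn_itend.*/nn_itend=80/' \
    -e 's/nn_isize.*/nn_isize=180/' \
    -e 's/nn_jsize.*/nn_jsize=148/' \
    -e 's/nn_ksize.*/nn_ksize=31/' \
    -e 's/ln_timing.*/ln_timing=.true./' \
    -e '/\&namctl/asn_cfctl%l_runstat=.true.' \
    "$demo_dir/namelist_in" > "$demo_dir/namelist_cfg"

# Show the settings that were changed or added
grep -E 'nn_(itend|isize|jsize|ksize)|ln_timing|l_runstat' "$demo_dir/namelist_cfg"
```

Running the same grep command on the generated ./namelist_cfg is a quick way to confirm that all six settings were applied before the file is copied into the experiment directories.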
Step 3 - Prepare a submission script (if required)
In principle, inside the experiment directories tests/BENCH_{0,PT,INFO}/EXP00/ it
would suffice to run the model as mpirun -n 4 ./nemo or similar. However, since
NEMO runs are typically submitted on an HPC system via a job scheduler, the next
step assumes that a script ./submit.sh contains the necessary system-specific
settings and commands (in effect starting the executable ./nemo in an MPI
environment):
$ vi submit.sh; chmod u+x ./submit.sh
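Because the contents of submit.sh are system specific, only a hedged sketch can be given here. The following writes a minimal script that starts ./nemo on 4 MPI tasks with mpirun, as in the command quoted above. The variable NPROC and the commented-out scheduler directives (shown in SLURM syntax purely as an example) are assumptions and need to be replaced by whatever the local job scheduler and MPI installation require.

```shell
# Sketch: generate a minimal ./submit.sh (hypothetical contents; adapt to the local system)
cat > ./submit.sh <<'EOF'
#!/bin/bash
# Scheduler directives would go here. The two lines below are inactive
# placeholders in SLURM syntax and are an assumption, not a NEMO requirement.
##SBATCH --ntasks=4
##SBATCH --time=00:10:00

# Number of MPI tasks (assumed variable name; defaults to 4)
NPROC=${NPROC:-4}

# Start the NEMO executable found in the current experiment directory
mpirun -n "$NPROC" ./nemo
EOF
chmod u+x ./submit.sh
```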
Step 4 - Run NEMO BENCH_0 and BENCH_PT
Next, the two model runs can be started:
$ cp namelist_cfg submit.sh tests/BENCH_0/EXP00/
$ cd tests/BENCH_0/EXP00/
$ <job submission command> ./submit.sh
$ cd -
$ cp namelist_cfg submit.sh tests/BENCH_PT/EXP00/
$ cd tests/BENCH_PT/EXP00/
$ <job submission command> ./submit.sh
$ cd -
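The same sequence can be wrapped in a small loop. In the sketch below, the function name run_bench and the variable SUBMIT_CMD are hypothetical; SUBMIT_CMD stands in for the job submission command and defaults to a harmless echo so that nothing is submitted unintentionally.

```shell
# Sketch: copy the inputs into each experiment directory and submit the run.
# SUBMIT_CMD is a placeholder for the job submission command (assumption).
SUBMIT_CMD=${SUBMIT_CMD:-echo would submit}

run_bench() {
    # $1: configuration name, e.g. BENCH_0 or BENCH_PT
    exp_dir="tests/$1/EXP00"
    if [ ! -d "$exp_dir" ]; then
        echo "skipping $1: $exp_dir not found" >&2
        return 1
    fi
    cp namelist_cfg submit.sh "$exp_dir/" || return 1
    # Run in a subshell so that the working directory is restored afterwards
    ( cd "$exp_dir" && $SUBMIT_CMD ./submit.sh )
}

for cfg in BENCH_0 BENCH_PT; do
    run_bench "$cfg" || echo "run of $cfg was not started" >&2
done
```

Setting SUBMIT_CMD to the actual job submission command before running the loop turns the dry run into a real submission.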
Verification and source-code inspection
Step 5 - Verify the PSyclone passthrough
Once the runs have finished, a comparison of the model output from the builds with and without PSyclone passthrough,
$ vimdiff tests/BENCH_{0,PT}/EXP00/run.stat
should reveal identical results.
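For a non-interactive check, for instance in a script, the two files can instead be compared byte by byte. The sketch below uses the standard cmp utility; the function name same_runstat is hypothetical.

```shell
# Sketch: report whether two run.stat files are identical (function name is hypothetical)
same_runstat() {
    # $1, $2: paths of the two run.stat files
    if [ ! -f "$1" ] || [ ! -f "$2" ]; then
        echo "missing file: $1 or $2" >&2
        return 2
    fi
    # cmp -s is silent and returns 0 only if the files are byte-for-byte identical
    if cmp -s "$1" "$2"; then
        echo "identical: $1 $2"
    else
        echo "DIFFERENT: $1 $2"
        return 1
    fi
}

same_runstat tests/BENCH_0/EXP00/run.stat tests/BENCH_PT/EXP00/run.stat || true
```

A return status of 0 corresponds to the expected outcome of the passthrough, 1 indicates differing output, and 2 indicates that at least one of the files was not found.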
Step 6 - Inspect the source code for the effect of the PSyclone passthrough
With PSyclone processing, the build system processes the NEMO source code in three stages
(or four with AGRIF): CPP preprocessing, PSyclone processing, and the actual
compilation. For the example of the BENCH_PT configuration, the original source-code
files are linked in directory tests/BENCH_PT/WORK, the CPP-preprocessed
versions can be found in directory tests/BENCH_PT/BLD_SCT_PSYCLONE/ppsrc/nemo/, and
the PSyclone-processed files supplied to the Fortran compiler are in
tests/BENCH_PT/BLD_SCT_PSYCLONE/obj/. For example, differences between the three
source-code variants for module usrdef_sbc can be visualised with:
$ vimdiff tests/BENCH_PT/WORK/usrdef_sbc.F90 tests/BENCH_PT/BLD_SCT_PSYCLONE/{ppsrc/nemo,obj}/usrdef_sbc.f90
The substantial transformation of the original WHERE construct starting at line 178 of
the original file in this example demonstrates the normalisation aspect of the PSyclone
processing stage. The build stages are illustrated by this example, which takes the
code block from its original form:
DO jl = 1, jpl
   WHERE ( phs(A2D(0),jl) <= 0._wp .AND. phi(A2D(0),jl) < 0.1_wp ) ! linear decrease from hi=0 to 10cm
      qtr_ice_top(:,:,jl) = qsr_ice(:,:,jl) * ( ztri(:,:) + ( 1._wp - ztri(:,:) ) * ( 1._wp - phi(A2D(0),jl) * 10._wp ) )
   ELSEWHERE( phs(A2D(0),jl) <= 0._wp .AND. phi(A2D(0),jl) >= 0.1_wp ) ! constant (ztri) when hi>10cm
      qtr_ice_top(:,:,jl) = qsr_ice(:,:,jl) * ztri(:,:)
   ELSEWHERE ! zero when hs>0
      qtr_ice_top(:,:,jl) = 0._wp
   END WHERE
ENDDO
through normal CPP macro expansion, to:
DO jl = 1, jpl
   WHERE ( phs(Nis0-(0):Nie0+(0),Njs0-(0):Nje0+(0),jl) <= 0._wp .AND. phi(Nis0-(0):Nie0+(0),Njs0-(0):Nje0+(0),jl) < 0.1_wp ) ! linear decrease from hi=0 to 10cm
      qtr_ice_top(:,:,jl) = qsr_ice(:,:,jl) * ( ztri(:,:) + ( 1._wp - ztri(:,:) ) * ( 1._wp - phi(Nis0-(0):Nie0+(0),Njs0-(0):Nje0+(0),jl) * 10._wp ) )
   ELSEWHERE( phs(Nis0-(0):Nie0+(0),Njs0-(0):Nje0+(0),jl) <= 0._wp .AND. phi(Nis0-(0):Nie0+(0),Njs0-(0):Nje0+(0),jl) >= 0.1_wp ) ! constant (ztri) when hi>10cm
      qtr_ice_top(:,:,jl) = qsr_ice(:,:,jl) * ztri(:,:)
   ELSEWHERE ! zero when hs>0
      qtr_ice_top(:,:,jl) = 0._wp
   END WHERE
ENDDO
and PSyclone transformation to:
do jl = 1, jpl, 1
  do widx2 = 1, Nje0 + 0 - (Njs0 - 0) + 1, 1
    do widx1 = 1, Nie0 + 0 - (Nis0 - 0) + 1, 1
      if (phs(Nis0 - 0 + widx1 - 1,Njs0 - 0 + widx2 - 1,jl) <= 0._wp .AND. &
         &phi(Nis0 - 0 + widx1 - 1,Njs0 - 0 + widx2 - 1,jl) < 0.1_wp) then
        qtr_ice_top(LBOUND(qtr_ice_top, dim=1) + widx1 - 1,LBOUND(qtr_ice_top, dim=2) + widx2 - 1,jl) = &
           & qsr_ice(LBOUND(qsr_ice, dim=1) + widx1 - 1,LBOUND(qsr_ice, dim=2) + widx2 - 1,jl) * &
           & (ztri(LBOUND(ztri, dim=1) + widx1 - 1,LBOUND(ztri, dim=2) + widx2 - 1) + &
           & (1._wp - ztri(LBOUND(ztri, dim=1) + widx1 - 1,LBOUND(ztri, dim=2) + widx2 - 1)) * &
           & (1._wp - phi(Nis0 - 0 + widx1 - 1,Njs0 - 0 + widx2 - 1,jl) * 10._wp))
      else
        if (phs(Nis0 - 0 + widx1 - 1,Njs0 - 0 + widx2 - 1,jl) <= 0._wp .AND. &
           &phi(Nis0 - 0 + widx1 - 1,Njs0 - 0 + widx2 - 1,jl) >= 0.1_wp) then
          qtr_ice_top(LBOUND(qtr_ice_top, dim=1) + widx1 - 1,LBOUND(qtr_ice_top, dim=2) + widx2 - 1,jl) = &
             & qsr_ice(LBOUND(qsr_ice, dim=1) + widx1 - 1,LBOUND(qsr_ice, dim=2) + widx2 - 1,jl) * &
             & ztri(LBOUND(ztri, dim=1) + widx1 - 1,LBOUND(ztri, dim=2) + widx2 - 1)
        else
          qtr_ice_top(LBOUND(qtr_ice_top, dim=1) + widx1 - 1,LBOUND(qtr_ice_top, dim=2) + widx2 - 1,jl) = 0._wp
        end if
      end if
    enddo
  enddo
enddo
where the latter has been manually reformatted for readability.
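The three source-code variants can also be compared without an interactive viewer. The sketch below counts the lines that diff reports as added or removed between consecutive build stages. The function name count_changes and the variables cfg_dir, orig, cpp, and psy are hypothetical; the paths are those used in the vimdiff command above.

```shell
# Sketch: count the lines that differ between two files (function name is hypothetical)
count_changes() {
    # $1, $2: files to compare; prints the number of added or removed lines.
    # grep -c prints 0 but returns a non-zero status when nothing matches, hence "|| true".
    diff "$1" "$2" | grep -c '^[<>]' || true
}

cfg_dir=tests/BENCH_PT
orig=$cfg_dir/WORK/usrdef_sbc.F90
cpp=$cfg_dir/BLD_SCT_PSYCLONE/ppsrc/nemo/usrdef_sbc.f90
psy=$cfg_dir/BLD_SCT_PSYCLONE/obj/usrdef_sbc.f90

if [ -f "$orig" ] && [ -f "$cpp" ] && [ -f "$psy" ]; then
    echo "original -> CPP:      $(count_changes "$orig" "$cpp") lines differ"
    echo "CPP      -> PSyclone: $(count_changes "$cpp" "$psy") lines differ"
else
    echo "build tree for BENCH_PT not found; nothing to compare" >&2
fi
```

Since the PSyclone stage regenerates the Fortran source, a large count for the second comparison is to be expected even for the passthrough and does not by itself indicate a functional change.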