.. _using_psyclone:

******************
Using PSyclone
******************

.. contents::
   :local:
   :depth: 1

Overview
========

This section contains step-by-step instructions that demonstrate an
application of the PSyclone-based source-code processing option available in
the NEMO build system. It assumes that PSyclone has been correctly installed
as detailed in the NEMO :doc:`installation guide ` and that an appropriate
arch file has been generated (hereafter referred to as ``arch-auto.fcm``).

This initial application is simply a PSyclone passthrough of a NEMO
configuration. That is, the source code is processed through PSyclone, but the
resultant code is functionally equivalent to the original source (although
some F90 constructs will be standardised). Later additions to this guide will
illustrate how to add transformation scripts to this process to perform
complex tasks such as identifying computational kernels and inserting OpenACC
directives for GPU offloading.

`PSyclone `_ processing of the NEMO source code is available as an option in
the NEMO build system, and PSyclone transformations can be enabled via
option::

    -p

of the ``./makenemo`` command (``-p all`` lists the available PSyclone
transformations). The different options correspond to transformation scripts
in directory ``sct/``, with the exception of ``passthrough`` (which is
supported internally).

This initial part of the PSyclone-processing guide demonstrates the use of the
``passthrough`` option from scratch, including the compilation of a model
configuration with and without PSyclone passthrough, running of the model, and
verification of the passthrough.

Compilation and running of the BENCH test case
==============================================

Step 1 - Compile the BENCH test case with and without PSyclone passthrough
--------------------------------------------------------------------------

At the top level of the NEMO repository, ::

    $ cd nemo

a reference configuration of the BENCH test case can be compiled::

    $ ./makenemo -m auto -a BENCH -n BENCH_0 -j 8 -v 1

Next, a corresponding configuration with PSyclone passthrough can be built::

    $ ./makenemo -m auto -a BENCH -n BENCH_PT -j 8 -v 1 -p passthrough

For demonstration purposes, a simple, non-invasive PSyclone transformation
script can alternatively be enabled::

    $ ./makenemo -m auto -a BENCH -n BENCH_INFO -j 8 -v 1 -p list_symbols

This variant produces the same source code as the passthrough, but during the
build process it outputs the names of all variables defined in the majority of
the Fortran modules and in the associated module procedures.

Step 2 - Prepare a suitable configuration
-----------------------------------------

For testing, a BENCH configuration with a small domain (ORCA2 equivalent) and
a low number of time steps (80) can be generated as::

    $ sed -e 's/nn_itend.*/nn_itend=80/' \
          -e 's/nn_isize.*/nn_isize=180/' \
          -e 's/nn_jsize.*/nn_jsize=148/' \
          -e 's/nn_ksize.*/nn_ksize=31/' \
          -e 's/ln_timing.*/ln_timing=.true./' \
          -e '/\&namctl/asn_cfctl%l_runstat=.true.' \
          tests/BENCH/EXPREF/namelist_cfg_orca1_like > ./namelist_cfg

The final expression appends the line ``sn_cfctl%l_runstat=.true.`` after the
``&namctl`` line of the namelist. The one-line form of the ``a`` (append)
command used there is a GNU ``sed`` extension.

Step 3 - Prepare a submission script (if required)
--------------------------------------------------

In principle, it would suffice to run the model inside the experiment
directories ``tests/BENCH_{0,PT,INFO}/EXP00/`` with ``mpirun -n 4 ./nemo`` or
similar. However, since NEMO runs are typically submitted on an HPC system via
a job scheduler, the next step assumes that a script ``./submit.sh`` contains
the necessary system-specific settings and commands (in effect starting the
executable ``./nemo`` in an MPI environment)::

    $ vi submit.sh; chmod u+x ./submit.sh

Step 4 - Run NEMO ``BENCH_0`` and ``BENCH_PT``
----------------------------------------------

Next, the two model runs can be started::

    $ cp namelist_cfg submit.sh tests/BENCH_0/EXP00/
    $ cd tests/BENCH_0/EXP00/
    $ ./submit.sh
    $ cd -
    $ cp namelist_cfg submit.sh tests/BENCH_PT/EXP00/
    $ cd tests/BENCH_PT/EXP00/
    $ ./submit.sh
    $ cd -

Verification and source-code inspection
=======================================

Step 5 - Verify the PSyclone passthrough
----------------------------------------

Once the runs have finished, a comparison of the model output from the builds
with and without PSyclone passthrough, ::

    $ vimdiff tests/BENCH_{0,PT}/EXP00/run.stat

is expected to reveal identical results.

Step 6 - Inspect the source code for the effect of the PSyclone passthrough
---------------------------------------------------------------------------

With PSyclone processing, the build system processes the NEMO source code in
three stages (or, with AGRIF, in four stages): CPP preprocessing, PSyclone
processing, and the actual compilation.
For the example of the ``BENCH_PT`` configuration, the original source-code
files are linked to in directory ``tests/BENCH_PT/WORK``, the CPP-preprocessed
versions can be found in directory
``tests/BENCH_PT/BLD_SCT_PSYCLONE/ppsrc/nemo/``, and the PSyclone-processed
files supplied to the Fortran compiler are at
``tests/BENCH_PT/BLD_SCT_PSYCLONE/obj/``. For example, differences between the
three source-code variants for module ``usrdef_sbc`` can be visualised with::

    $ vimdiff tests/BENCH_PT/WORK/usrdef_sbc.F90 tests/BENCH_PT/BLD_SCT_PSYCLONE/{ppsrc/nemo,obj}/usrdef_sbc.f90

The substantial transformation of the ``WHERE`` construct starting at line 178
of the original file demonstrates the normalisation aspect of the PSyclone
processing stage. The build stages take this code block from its original:

.. code-block:: fortran

   DO jl = 1, jpl
      WHERE ( phs(A2D(0),jl) <= 0._wp .AND. phi(A2D(0),jl) < 0.1_wp )       ! linear decrease from hi=0 to 10cm
         qtr_ice_top(:,:,jl) = qsr_ice(:,:,jl) * ( ztri(:,:) + ( 1._wp - ztri(:,:) ) * ( 1._wp - phi(A2D(0),jl) * 10._wp ) )
      ELSEWHERE( phs(A2D(0),jl) <= 0._wp .AND. phi(A2D(0),jl) >= 0.1_wp )   ! constant (ztri) when hi>10cm
         qtr_ice_top(:,:,jl) = qsr_ice(:,:,jl) * ztri(:,:)
      ELSEWHERE                                                             ! zero when hs>0
         qtr_ice_top(:,:,jl) = 0._wp
      END WHERE
   ENDDO

through normal CPP macro expansion, to:

.. code-block:: fortran

   DO jl = 1, jpl
      WHERE ( phs(Nis0-(0):Nie0+(0),Njs0-(0):Nje0+(0),jl) <= 0._wp .AND. phi(Nis0-(0):Nie0+(0),Njs0-(0):Nje0+(0),jl) < 0.1_wp )       ! linear decrease from hi=0 to 10cm
         qtr_ice_top(:,:,jl) = qsr_ice(:,:,jl) * ( ztri(:,:) + ( 1._wp - ztri(:,:) ) * ( 1._wp - phi(Nis0-(0):Nie0+(0),Njs0-(0):Nje0+(0),jl) * 10._wp ) )
      ELSEWHERE( phs(Nis0-(0):Nie0+(0),Njs0-(0):Nje0+(0),jl) <= 0._wp .AND. phi(Nis0-(0):Nie0+(0),Njs0-(0):Nje0+(0),jl) >= 0.1_wp )   ! constant (ztri) when hi>10cm
         qtr_ice_top(:,:,jl) = qsr_ice(:,:,jl) * ztri(:,:)
      ELSEWHERE                                                             ! zero when hs>0
         qtr_ice_top(:,:,jl) = 0._wp
      END WHERE
   ENDDO

and PSyclone transformation to:

.. code-block:: fortran

   do jl = 1, jpl, 1
     do widx2 = 1, Nje0 + 0 - (Njs0 - 0) + 1, 1
       do widx1 = 1, Nie0 + 0 - (Nis0 - 0) + 1, 1
         if (phs(Nis0 - 0 + widx1 - 1,Njs0 - 0 + widx2 - 1,jl) <= 0._wp .AND. &
            &phi(Nis0 - 0 + widx1 - 1,Njs0 - 0 + widx2 - 1,jl) < 0.1_wp) then
           qtr_ice_top(LBOUND(qtr_ice_top, dim=1) + widx1 - 1,LBOUND(qtr_ice_top, dim=2) + widx2 - 1,jl) = &
              & qsr_ice(LBOUND(qsr_ice, dim=1) + widx1 - 1,LBOUND(qsr_ice, dim=2) + widx2 - 1,jl) * &
              & (ztri(LBOUND(ztri, dim=1) + widx1 - 1,LBOUND(ztri, dim=2) + widx2 - 1) + &
              & (1._wp - ztri(LBOUND(ztri, dim=1) + widx1 - 1,LBOUND(ztri, dim=2) + widx2 - 1)) * &
              & (1._wp - phi(Nis0 - 0 + widx1 - 1,Njs0 - 0 + widx2 - 1,jl) * 10._wp))
         else
           if (phs(Nis0 - 0 + widx1 - 1,Njs0 - 0 + widx2 - 1,jl) <= 0._wp .AND. &
              &phi(Nis0 - 0 + widx1 - 1,Njs0 - 0 + widx2 - 1,jl) >= 0.1_wp) then
             qtr_ice_top(LBOUND(qtr_ice_top, dim=1) + widx1 - 1,LBOUND(qtr_ice_top, dim=2) + widx2 - 1,jl) = &
                & qsr_ice(LBOUND(qsr_ice, dim=1) + widx1 - 1,LBOUND(qsr_ice, dim=2) + widx2 - 1,jl) * &
                & ztri(LBOUND(ztri, dim=1) + widx1 - 1,LBOUND(ztri, dim=2) + widx2 - 1)
           else
             qtr_ice_top(LBOUND(qtr_ice_top, dim=1) + widx1 - 1,LBOUND(qtr_ice_top, dim=2) + widx2 - 1,jl) = 0._wp
           end if
         end if
       enddo
     enddo
   enddo

where the latter has been manually reformatted for readability.
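
As a non-interactive alternative to the ``vimdiff`` comparison in Step 5, the
check can be scripted, for example for use in a batch job. The following is
only a sketch: the function name ``compare_run_stat`` and its exit codes are
hypothetical and not part of the NEMO tooling, and it assumes that both runs
have written a ``run.stat`` file into their experiment directory and that a
byte-for-byte match is the desired criterion.

.. code-block:: shell

   # Hypothetical helper (not part of NEMO): compare the run.stat files of
   # two experiment directories byte for byte.
   # Usage: compare_run_stat REF_DIR TEST_DIR
   # Exit status: 0 = identical, 1 = different, 2 = a file is missing
   compare_run_stat() {
       ref="$1/run.stat"
       new="$2/run.stat"
       # Both files must exist; a missing file suggests a failed or unfinished run
       for f in "$ref" "$new"; do
           if [ ! -f "$f" ]; then
               echo "missing: $f"
               return 2
           fi
       done
       if cmp -s "$ref" "$new"; then
           echo "identical: $ref $new"
           return 0
       fi
       echo "DIFFERENT: $ref $new"
       # Print the first few differing lines to help locate the divergence
       diff "$ref" "$new" | head -n 10
       return 1
   }

For the two runs of Step 4, it could be called from the top level of the
repository as ``compare_run_stat tests/BENCH_0/EXP00 tests/BENCH_PT/EXP00``;
an exit status of 0 then indicates identical files.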