Thread safe Gridded Components Checklist
Subtle issues in the implementation of gridded components can make them not thread-safe. Each gridded component should be checked against the list below:
Thread-safe components must not use any shared variables, with the possible exception of those used within !$omp master and !$omp single sections. And while OpenMP will generally treat local variables of procedures called from within a parallel region as thread private, it cannot do this with SAVEd Fortran variables.
These are relatively rare in GEOS, but developers should still verify that their components do not use the SAVE attribute.
Module variables, i.e., variables declared in the specification section of a module, implicitly have the SAVE attribute. These include module variables declared with allocatable. Declaring derived types in the specification is perfectly fine and appropriate.
Fortran variables that are initialized in their declaration, e.g.,
integer :: i = 0
logical :: init = .false.
real, pointer :: q(:,:,:) => null()
also implicitly have the SAVE attribute. The use of null() to initialize pointers is particularly surprising to some developers, and rather annoying, as pointers otherwise start out with an undefined association status.
The solution is generally to delete the initialization portion of the variable declaration and instead initialize the variable at the beginning of the executable section of the procedure. For example:
integer :: i
logical :: init
real, pointer :: q(:,:,:)
i = 0
init = .false.
q => null()
Note that this fix may not be equivalent to the original implementation. Consider the case above where init was being used to ensure that some expensive operation is only performed during the first call to a procedure. Moving that initialization to the beginning of the executable section will result in the expensive operation being performed on every invocation of that procedure. (So what is the solution?)
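One possible answer to the question above, offered as a minimal sketch rather than as established GEOS practice: keep the once-only state in module variables, but read and write it only inside a named critical section. The names one_time_setup, get_table, expensive_setup and setup_count are hypothetical. Where a component has an initialize phase that runs outside any parallel region, doing the expensive work there is probably simpler.

```fortran
module one_time_setup
   implicit none
   private
   public :: get_table

   ! Shared (implicitly SAVEd) state. It is only ever read or written
   ! inside the named critical section below.
   logical         :: initialized = .false.
   integer, public :: setup_count = 0   ! how often the expensive step ran
   real            :: table(4)

contains

   ! Returns a private copy of the table; builds the table on first use.
   subroutine get_table(t)
      real, intent(out) :: t(4)

      !$omp critical (one_time_setup_lock)
      if (.not. initialized) then
         call expensive_setup()
         initialized = .true.
      end if
      t = table
      !$omp end critical (one_time_setup_lock)
   end subroutine get_table

   ! Stand-in for the expensive operation (assumption for illustration).
   subroutine expensive_setup()
      integer :: k
      do k = 1, size(table)
         table(k) = real(k)**2
      end do
      setup_count = setup_count + 1
   end subroutine expensive_setup

end module one_time_setup
```

The trade-off is that every call passes through the critical section, so this only suits routines that are called rarely or whose guarded part is cheap.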
Consider the following snippet:
type MyType
real, pointer :: q(:,:,:) => null()
end type MyType
...
type(MyType) :: var1
type(MyType), pointer :: p_var2
type(MyType), allocatable :: var3
Because the derived type MyType has a default-initialized component (in this case a pointer q default-initialized to null()), variables of that type also have default initialization and implicitly have the SAVE attribute. In particular, var1 above is not thread safe. This can be very difficult to diagnose, as the type definition may be in a different file or even in an external library.
Note that p_var2 and var3 in the above snippet do not have the SAVE attribute. So dynamic allocation provides a potential workaround when a type has a default-initialized component, or when it is unknown whether it does.
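The dynamic-allocation workaround can be sketched as follows. The module name mytype_workaround and the function local_sum are hypothetical; MyType is the type from the snippet above. This is an illustration under those assumptions, not code taken from GEOS.

```fortran
module mytype_workaround
   implicit none
   private
   public :: MyType, local_sum

   type MyType
      real, pointer :: q(:,:,:) => null()
   end type MyType

contains

   ! Sums an n x n x n array of ones held in a locally allocated MyType.
   real function local_sum(n) result(s)
      integer, intent(in) :: n
      ! Allocatable instead of a plain "type(MyType) :: var1" local.
      type(MyType), allocatable :: var3

      allocate(var3)               ! default initialization applies: q => null()
      allocate(var3%q(n, n, n))
      var3%q = 1.0
      s = sum(var3%q)

      deallocate(var3%q)           ! pointer targets are not freed automatically
      deallocate(var3)
   end function local_sum

end module mytype_workaround
```

Each call creates and destroys its own var3, so concurrent calls from different threads do not share the object.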
Another way in which a gridded component can run afoul of thread safety is to make calls into other layers which are not themselves thread-safe.
The largest concern here is I/O, which can call system-level libraries that are not thread safe. Usually the solution for local I/O is to surround the print/write statements with OpenMP guards such as:
!$omp critical (my_print)
print*,...
!$omp end critical (my_print)
Note that it is very important to name critical sections in GEOS. All unnamed critical sections share a single lock, so nesting them (for example through a call into another layer) can deadlock accidentally.
Of course, if you only want one thread to do the I/O, you should use the single or master directive instead. (These do not take names.)
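A minimal sketch contrasting the two guards, meant to be called by every thread of an enclosing parallel region. The names guarded_io, report, tid and nlines are illustrative assumptions; nlines is a shared counter that is only updated under the named lock.

```fortran
module guarded_io
   implicit none
   private
   public :: report

contains

   subroutine report(tid, nlines)
      integer, intent(in)    :: tid      ! caller-supplied thread id
      integer, intent(inout) :: nlines   ! shared count of per-thread lines

      ! Every thread prints, but only one at a time.
      !$omp critical (my_print)
      print *, 'thread ', tid, ' reporting'
      nlines = nlines + 1
      !$omp end critical (my_print)

      ! Exactly one thread of the team prints; the others wait at the
      ! implicit barrier that ends the single construct.
      !$omp single
      print *, 'printed once per team'
      !$omp end single
   end subroutine report

end module guarded_io
```

When report is called outside a parallel region, both constructs simply execute on the calling thread.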
Unless you are certain that a given layer is itself thread-safe, you should surround calls into it with OpenMP critical sections as described above for I/O. Ideally someone should fix said layer and then remove the critical section. Best practice is to comment the reason for the critical section so that later developers can check whether it is still necessary.
Notable layers which are currently not thread safe:
- pFlogger (output on shared files/units)
- Timers (use global counters sometimes)
- MAPL and ESMF GetResource on the universal (shared) config.
Future improvements are expected to extend pFlogger and MAPL_Profiler to be thread safe and hopefully even thread-aware. However, the universal config will likely remain a shared resource. Best practice is to limit access to the initialize phase and to surround each access with a critical section. Setting values into a config may require an OpenMP barrier, depending on subsequent use.
OpenMP and Fortran evolve separately as standards. A consequence of this is that OpenMP is generally somewhat behind the Fortran standard in terms of specifying what language features can be used within parallel regions.
BLOCK was introduced in F2008 and is not allowed (as of 2021-08-13) within a parallel region. This construct is not frequently used within GEOS, and any occurrence can readily be replaced either with inline code or with a call to a module procedure or an internal procedure.
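As a sketch of the internal-procedure replacement, the following moves what would have been a BLOCK-local temporary into an internal subroutine. The names block_replacement, scale_all and scale_one, and the arithmetic itself, are hypothetical.

```fortran
module block_replacement
   implicit none
   private
   public :: scale_all

contains

   subroutine scale_all(x)
      real, intent(inout) :: x(:)
      integer :: i

      !$omp parallel do
      do i = 1, size(x)
         ! Hypothetical original loop body, using a BLOCK construct:
         !    block
         !       real :: tmp
         !       tmp = 2.0 * x(i)
         !       x(i) = tmp + 1.0
         !    end block
         call scale_one(x(i))
      end do
      !$omp end parallel do

   contains

      subroutine scale_one(v)
         real, intent(inout) :: v
         real :: tmp   ! local to each call, hence private to each thread
         tmp = 2.0 * v
         v = tmp + 1.0
      end subroutine scale_one

   end subroutine scale_all

end module block_replacement
```

Note that tmp is declared without an initializer, so it does not pick up the SAVE attribute.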
The GNU Fortran compiler (gfortran) does not allow a RETURN statement from within an OpenMP block. For example, the compiler complains about the implicit return statement in VERIFY_(STATUS):
!$omp critical
call ESMF_ConfigGetAttribute(CF, DT, Label="RUN_DT:" , RC=STATUS)
VERIFY_(STATUS)
!$omp end critical
The workaround is to move VERIFY_(STATUS) to after !$omp end critical.
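A self-contained analogue of this workaround, written without the MAPL macros so it compiles on its own. The routine fetch is a hypothetical stand-in for a call such as ESMF_ConfigGetAttribute, and the explicit status test stands in for what VERIFY_ does; both are assumptions for illustration.

```fortran
module deferred_return
   implicit none
   private
   public :: get_dt

contains

   ! Hypothetical stand-in for a non-thread-safe call that reports
   ! success or failure through rc.
   subroutine fetch(value, rc)
      real,    intent(out) :: value
      integer, intent(out) :: rc
      value = 450.0
      rc = 0
   end subroutine fetch

   subroutine get_dt(dt, rc)
      real,    intent(out) :: dt
      integer, intent(out) :: rc
      integer :: status

      !$omp critical (get_dt_lock)
      call fetch(dt, rc=status)
      !$omp end critical (get_dt_lock)

      ! The early return happens only after the critical section closes.
      if (status /= 0) then
         rc = status
         return
      end if
      rc = 0
   end subroutine get_dt

end module deferred_return
```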
If the block is longer, with multiple VERIFY/RC checks, chain the calls on the status and verify once after the block:
!$omp critical
call first(..., rc=status)
if (status == _SUCCESS) call second(..., rc=status)
if (status == _SUCCESS) call third(..., rc=status)
!$omp end critical
_VERIFY(status)