!   if( pe.EQ.0 )call mpp_sync()
!
! Here only PE 0 reaches the barrier, where it will wait
! indefinitely. While this is a particularly egregious example to
! illustrate the coding flaw, more subtle versions of the same are
! among the most common errors in parallel code.
!
! It is therefore important to be conscious of the context of a
! subroutine or function call, and the implied synchronization. There
! are certain calls here (e.g mpp_declare_pelist, mpp_init,
! mpp_malloc, mpp_set_stack_size) which must be called by all
! PEs. There are others which are made by a subset of PEs (here
! called a pelist), and must be called by all the PEs in that
! pelist (e.g mpp_max, mpp_sum, mpp_sync). Still
! others imply no synchronization at all. I will make every effort to
! highlight the context of each call in the MPP modules, so that the
! implicit synchronization is spelt out.
!
! For performance it is necessary to keep synchronization as limited
! as the algorithm being implemented will allow. For instance, a single
! message between two PEs should only imply synchronization across the
! PEs in question. A global synchronization (or barrier)
! is likely to be slow, and is best avoided. But codes first
! parallelized on a Cray T3E tend to have many global syncs, as very
! fast barriers were implemented there in hardware.
!
! Another reason to use pelists is to run a single program in MPMD
! mode, where different PE subsets work on different portions of the
! code. A typical example is to assign an ocean model and atmosphere
! model to different PE subsets, and couple them concurrently instead of
! running them serially. The MPP module provides the notion of a
! current pelist, which is set when a group of PEs branches off
! into a subset. Subsequent calls that omit the pelist optional
! argument (seen below in many of the individual calls) assume that the
! implied synchronization is across the current pelist. The calls
! mpp_root_pe and mpp_npes also return the values
! appropriate to the current pelist. The mpp_set_current_pelist
! call is provided to set the current pelist.
!
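! By way of illustration, here is a minimal sketch of MPMD-style
! concurrent coupling using pelists. The subset sizes, the pelist names
! 'ocean' and 'atmos', and the routines ocean_model() and atmos_model()
! are hypothetical placeholders; the sketch assumes the
! mpp_declare_pelist and mpp_set_current_pelist interfaces described in
! this module, and that calling mpp_set_current_pelist with no argument
! reverts to the full pelist.
!
!   integer :: p, pe, npes
!   integer, allocatable :: ocean_pelist(:), atmos_pelist(:)
!
!   call mpp_init()
!   pe   = mpp_pe()
!   npes = mpp_npes()
!   !assign the first half of the PEs to the ocean, the rest to the atmosphere
!   allocate( ocean_pelist(npes/2), atmos_pelist(npes-npes/2) )
!   ocean_pelist = (/ (p, p=0,npes/2-1)    /)
!   atmos_pelist = (/ (p, p=npes/2,npes-1) /)
!   !declaring the pelists must be done by all PEs
!   call mpp_declare_pelist( ocean_pelist, 'ocean' )
!   call mpp_declare_pelist( atmos_pelist, 'atmos' )
!
!   if( ANY(pe.EQ.ocean_pelist) )then
!       call mpp_set_current_pelist( ocean_pelist )
!       call ocean_model()  !pelist-less calls now span the ocean PEs only
!   else
!       call mpp_set_current_pelist( atmos_pelist )
!       call atmos_model()  !pelist-less calls now span the atmosphere PEs only
!   end if
!   call mpp_set_current_pelist()  !revert to the full pelist
!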
!   call mpp_error
!   call mpp_error(FATAL)
!
! are equivalent.
!
! The argument order
!
!   call mpp_error( routine, errormsg, errortype )
!
! is also provided to support legacy code. In this version of the
! call, none of the arguments may be omitted.
!
! The behaviour of mpp_error for a WARNING can be
! controlled with an additional call mpp_set_warn_level.
!
!   call mpp_set_warn_level(ERROR)
!
! causes mpp_error to treat WARNING
! exactly like FATAL.
!
!   call mpp_set_warn_level(WARNING)
!
! resets to the default behaviour described above.
!
! mpp_error also has an internal error state which
! maintains knowledge of whether a warning has been issued. This can be
! used at startup in a subroutine that checks if the model has been
! properly configured. You can generate a series of warnings using
! mpp_error, and then check at the end whether any warnings have been
! issued using the function mpp_error_state(). If the value of
! this is WARNING, at least one warning has been issued, and
! the user can take appropriate action:
!
!   if( ... )call mpp_error( WARNING, '...' )
!   if( ... )call mpp_error( WARNING, '...' )
!   if( ... )call mpp_error( WARNING, '...' )
!   ...
!   if( mpp_error_state().EQ.WARNING )call mpp_error( FATAL, '...' )
!
!   real, dimension(n) :: a
!   if( pe.EQ.0 )then
!       do p = 1,npes-1
!          call mpp_transmit( a, n, p, a, n, NULL_PE )
!       end do
!   else
!       call mpp_transmit( a, n, NULL_PE, a, n, 0 )
!   end if
!
!   call mpp_transmit( a, n, ALL_PES, a, n, 0 )
!
! The do loop and the broadcast operation above are equivalent.
!
! Two overloaded calls mpp_send and mpp_recv have also been
! provided. mpp_send calls mpp_transmit
! with get_pe=NULL_PE. mpp_recv calls
! mpp_transmit with put_pe=NULL_PE. Thus
! the do loop above could be written more succinctly:
!
!   if( pe.EQ.0 )then
!       do p = 1,npes-1
!          call mpp_send( a, n, p )
!       end do
!   else
!       call mpp_recv( a, n, 0 )
!   end if
!
!   use mpp_mod
!   integer :: pe, chksum
!   real, allocatable :: a(:)
!   pe = mpp_pe()
!   chksum = mpp_chksum( a, (/pe/) )
!
! The additional functionality of mpp_chksum over
! serial checksums is to compute the checksum across the PEs in
! pelist. The answer is guaranteed to be the same for
! the same distributed array irrespective of how it has been
! partitioned.
!
! If pelist is omitted, the context is assumed to be the
! current pelist. This call implies synchronization across the PEs in
! pelist, or the current pelist if pelist is absent.
!
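! As a hedged illustration of the pelist-omitted form, the sketch below
! checksums a domain-decomposed array across the current pelist. The
! array a_local and its local size nlocal are hypothetical placeholders.
!
!   use mpp_mod
!   integer :: global_chksum
!   real, allocatable :: a_local(:)   !this PE's portion of a distributed array
!   allocate( a_local(nlocal) )       !nlocal: local extent, hypothetical
!   !... fill a_local ...
!   !no pelist argument: the checksum spans the current pelist, and the
!   !answer is independent of how the array is partitioned across those PEs
!   global_chksum = mpp_chksum( a_local )
!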
!   integer :: id
!   id = mpp_clock_id( 'Atmosphere' )
!   call mpp_clock_begin(id)
!   call atmos_model()
!   call mpp_clock_end(id)
!
! Two flags may be used to alter the behaviour of
! mpp_clock. If the flag MPP_CLOCK_SYNC is turned on
! by mpp_clock_id, the clock calls mpp_sync across all
! the PEs in the current pelist at the top of the timed code section,
! but allows each PE to complete the code section (and reach
! mpp_clock_end) at different times. This allows us to measure
! load imbalance for a given code section. Statistics are written to
! stdout by mpp_exit.
!
! The flag MPP_CLOCK_DETAILED may be turned on by
! mpp_clock_id to get detailed communication
! profiles. Communication events of the types SEND, RECV, BROADCAST,
! REDUCE and WAIT are separately measured for data volume
! and time. Statistics are written to stdout by
! mpp_exit, and individual PE info is also written to the file
! mpp_clock.out.#### where #### is the PE id given by
! mpp_pe.
!
! The flags MPP_CLOCK_SYNC and MPP_CLOCK_DETAILED are
! integer parameters available by use association, and may be summed to
! turn them both on.
!
! While the nesting of clocks is allowed, please note that turning on
! the optional flags on inner clocks has certain subtle issues.
! Turning on MPP_CLOCK_SYNC on an inner
! clock may distort outer clock measurements of load imbalance. Turning
! on MPP_CLOCK_DETAILED will stop detailed measurements on its
! outer clock, since only one detailed clock may be active at one time.
! Also, detailed clocks only time a certain number of events per clock
! (currently 40000) to conserve memory. If this array overflows, a
! warning message is printed, and subsequent events for this clock are
! not timed.
!
! Timings are done using the f90 standard
! system_clock intrinsic.
!
! The resolution of system_clock is often too coarse for use except
! across large swaths of code. On SGI systems this is transparently
! overloaded with a higher resolution clock made available through a
! non-portable fortran interface to nsclock.c. This approach will
! eventually be extended to other platforms.
!
! New behaviour added at the Havana release allows the user to embed
! profiling calls at varying levels of granularity all over the code,
! and for any particular run, set a threshold of granularity so that
! finer-grained clocks become dormant.
!
! The threshold granularity is held in the private module variable
! clock_grain. This value may be modified by the call
! mpp_clock_set_grain, and affects clocks initiated by
! subsequent calls to mpp_clock_id. The value of
! clock_grain is set to an arbitrarily large number initially.
!
! Clocks initialized by mpp_clock_id can supply the new optional
! argument grain to set their granularity level. Clocks check
! this level against the current value of clock_grain, and are
! only triggered if they are at or below ("coarser than") the
! threshold. Finer-grained clocks are dormant for that run. A sketch
! combining the flags and grain arguments is given after the list below.
!
! The following grain levels are pre-defined:
!
!   !predefined clock granularities, but you can use any integer
!   !using CLOCK_LOOP and above may distort coarser-grain measurements
!   integer, parameter, public :: CLOCK_COMPONENT=1     !component level, e.g model, exchange
!   integer, parameter, public :: CLOCK_SUBCOMPONENT=11 !top level within a model component, e.g dynamics, physics
!   integer, parameter, public :: CLOCK_MODULE=21       !module level, e.g main subroutine of a physics module
!   integer, parameter, public :: CLOCK_ROUTINE=31      !level of individual subroutine or function
!   integer, parameter, public :: CLOCK_LOOP=41         !loops or blocks within a routine
!   integer, parameter, public :: CLOCK_INFRA=51        !infrastructure level, e.g halo update
!
! Note that subsequent changes to clock_grain do not
! change the status of already initiated clocks, and that if the
! optional grain argument is absent, the clock is always
! triggered. This guarantees backward compatibility.
!
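! Below is a hedged sketch tying the flags and grain arguments
! together. The clock names and the routine atmos_model() being timed
! are placeholders; it assumes the optional flags and grain arguments
! of mpp_clock_id and the mpp_clock_set_grain call described above.
!
!   integer :: id_atmos, id_loop
!   !only clocks at CLOCK_MODULE granularity or coarser will trigger this run
!   call mpp_clock_set_grain(CLOCK_MODULE)
!   !synchronized, detailed clock at component granularity
!   id_atmos = mpp_clock_id( 'Atmosphere', flags=MPP_CLOCK_SYNC+MPP_CLOCK_DETAILED, &
!                            grain=CLOCK_COMPONENT )
!   !finer than CLOCK_MODULE, so this clock remains dormant for this run
!   id_loop  = mpp_clock_id( 'Atmosphere: inner loop', grain=CLOCK_LOOP )
!
!   call mpp_clock_begin(id_atmos)
!   call atmos_model()                !the timed code section
!   call mpp_clock_end(id_atmos)
!   !statistics are written to stdout by mpp_exit
!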