
MAPL History Component

Ben Auer edited this page Nov 10, 2021 · 45 revisions

Overview

The History component is one of several specialized components provided by the MAPL library. History exists to write diagnostic data from the export state of a gridded component in a MAPL hierarchy. Note that History does not handle checkpointing of component states for subsequent use as restarts; that is done by separate code. The History resource file consists of "collections". Each collection defines a group of variables, together with the components they are found in, that are output with identical parameters.

Input file specification

The History resource file uses the ESMF config format. It is built around the concept of a collection: a set of fields that are written to a common file stream and processed for output with the same options. The basic History resource file consists of three sections plus some option keywords that apply to the output as a whole. Files are named from the EXPID, the collection name, and the collection template (EXPID+collection_name+collection_template). The following describes the options for the resource file.
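The naming rule can be illustrated with a small Python sketch. It is not MAPL code: the function names are hypothetical, the "." separator between the three parts is an assumption (the text above only says the parts are combined), and only the GrADS-style tokens that appear in the example template on this page are handled.

```python
from datetime import datetime

# GrADS-style tokens used in the example template on this page:
# %y4 = 4-digit year, %m2 = month, %d2 = day, %h2 = hour, %n2 = minute.
GRADS_TOKENS = {
    "%y4": "%Y",
    "%m2": "%m",
    "%d2": "%d",
    "%h2": "%H",
    "%n2": "%M",
}


def expand_template(template, when):
    """Replace each supported GrADS-style token with the matching time field."""
    out = template
    for token, fmt in GRADS_TOKENS.items():
        out = out.replace(token, when.strftime(fmt))
    return out


def history_filename(expid, collection, template, when, sep="."):
    """Join EXPID, collection name, and expanded template.

    The separator is an assumption made for this sketch.
    """
    return sep.join([expid, collection, expand_template(template, when)])


print(history_filename("myexp", "geosgcm_prog",
                       "%y4%m2%d2_%h2%n2z.nc4",
                       datetime(2000, 4, 14, 21, 0)))
```

Here "myexp" is a made-up experiment id; the template is the one shown in the collection options.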

Global Options

The following are global options that may be set in the resource file:

EXPID: experiment id
FileOrder: optional, sets the order of the variables of a collection in the netcdf file. The default is alphabetical, with any variables that are part of the metadata, such as lons or lats, placed first. If you don't want this for some reason, set it to "add_order", which keeps the variables in the order they are added to the netcdf file.

Collection List

The collection list specifies which collections to write. A collection that is defined in the rc file but not listed here will not be written. The collection list is specified as follows:

Collections: 'collection_a'
             'collection_b'
::

Grid Labels

The grid label section provides a list of grid definitions that collections may refer to. A definition sets the output grid for a collection when the user wants the output regridded to a grid other than the native grid of the requested field. Currently Lat-Lon and Cubed-Sphere grids are supported. Each entry has the form grid_name.option, where grid_name is the name referred to in the collection. Every grid definition must have a GRID_TYPE entry; the remaining entries vary with the grid type. Here is an example Lat-Lon definition. The user specifies the longitudinal size (IM_WORLD), the latitudinal size (JM_WORLD), the pole option (PC or PE, for pole center and pole edge), and the dateline option (DE or DC, for dateline edge and dateline center).

PC96x49-DC.GRID_TYPE: LatLon
PC96x49-DC.IM_WORLD: 96
PC96x49-DC.JM_WORLD: 49
PC96x49-DC.POLE: PC
PC96x49-DC.DATELINE: DC
PC96x49-DC.LM: 72
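To make the pole and dateline options concrete, here is a rough Python sketch of the cell-center coordinates such a definition implies. It is not MAPL code, and the conventions are assumptions: PC puts grid points exactly on the poles, PE puts cell edges on the poles, DC centers the first cell on the dateline (-180), and DE puts the first cell edge there. The function name is hypothetical.

```python
def latlon_centers(im_world, jm_world, pole, dateline):
    """Return (lons, lats) cell centers in degrees for a regular Lat-Lon grid.

    The PC/PE and DC/DE conventions used here are assumptions of this sketch.
    """
    dlon = 360.0 / im_world
    if dateline == "DC":        # first cell centered on the dateline
        lon0 = -180.0
    elif dateline == "DE":      # first cell edge on the dateline
        lon0 = -180.0 + dlon / 2.0
    else:
        raise ValueError("DATELINE must be 'DC' or 'DE'")

    if pole == "PC":            # first and last points sit on the poles
        if jm_world < 2:
            raise ValueError("PC needs at least two latitudes")
        dlat = 180.0 / (jm_world - 1)
        lat0 = -90.0
    elif pole == "PE":          # cell edges sit on the poles
        dlat = 180.0 / jm_world
        lat0 = -90.0 + dlat / 2.0
    else:
        raise ValueError("POLE must be 'PC' or 'PE'")

    lons = [lon0 + i * dlon for i in range(im_world)]
    lats = [lat0 + j * dlat for j in range(jm_world)]
    return lons, lats


lons, lats = latlon_centers(96, 49, "PC", "DC")
print(lons[:3], lats[:3])
```

Under these assumptions the PC96x49-DC example has a spacing of 3.75 degrees in both directions.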

For a complete list of supported grid types and the options for each type, see the following page about creating grids from an ESMF_Config (which is what the History.rc file is): https://github.com/GEOS-ESM/MAPL/wiki/Creating-Grids-with-MAPL-Grid-Factories

The actual grid-to-grid transformation is performed using ESMF. Bilinear and first-order conservative regridding are currently supported. For more information see: ESMF Regridding

Collections

coll_name.template: grads style template that defines the time characteristics of the output file, e.g. %y4%m2%d2_%h2%n2z.nc4
coll_name.format: output file format, 'flat' binary or 'CFIO' netcdf, optional, default 'flat'
coll_name.mode: controls time output, whether to time average or write instantaneous values. Options 'instantaneous' (default) or 'time-averaged'
coll_name.frequency: time interval in HHMMSS format, how often the collection is written
coll_name.duration: time interval in HHMMSS format, defines how long to write to the current file before creating a new file; by default duration equals the frequency, so only one time is written to each file
coll_name.grid_label: grid definition to use for the output regridding
coll_name.vscale:
coll_name.vunit:
coll_name.vvars:
coll_name.levels:
coll_name.ref_time: time in HHMMSS format, optional, reference time used in conjunction with ref_date and frequency to determine when to write, default 000000
coll_name.ref_date: date in YYYYMMDD format, optional, reference date used in conjunction with ref_time and frequency to determine when to write, defaults to the date of the application clock
coll_name.end_date: date in YYYYMMDD format, optional, turns off collection at this date, by default no end date
coll_name.end_time: time in HHMMSS format, optional, turns off collection at this time, by default no end time
coll_name.regrid_name:
coll_name.regrid_exch:
coll_name.fields: Definition of the fields that make up the collection, described later
coll_name.monthly:
coll_name.splitField:
coll_name.UseRegex:
coll_name.nbits: bit shaving, integer, optional; if not present, no bit shaving is done, otherwise that many bits of the mantissa are retained, which is useful for better compression
coll_name.deflate: netcdf compression level, default 0, can be 0-9
coll_name.chunksize: netcdf chunking, by default the chunksizes match the dimensions; otherwise it must be a comma-separated list of numbers, one per dimension in the output file. For example, if you are outputting on a 180x90 lat-lon grid and there are 3D variables in the file, the file will have 4 dimensions (lon,lat,lev,time), so you could say 90,45,1,1
coll_name.conservative: use conservative regridding, default 0, 0 - bilinear, 1 - conservative
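The interplay of frequency, duration, ref_date and ref_time can be sketched in Python. This is not MAPL's alarm logic: the function names are hypothetical, and the sketch assumes that writes fall on the reference date and time plus whole multiples of the frequency.

```python
from datetime import datetime, timedelta


def parse_hhmmss(value):
    """Convert an HHMMSS value such as '030000' into a timedelta."""
    hours, rest = divmod(int(value), 10000)
    minutes, seconds = divmod(rest, 100)
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def records_per_file(frequency, duration=None):
    """Number of times written to one file; duration defaults to frequency."""
    freq = parse_hhmmss(frequency)
    dur = freq if duration is None else parse_hhmmss(duration)
    return dur // freq


def write_times(ref_date, ref_time, frequency, count):
    """First `count` write times, assuming writes occur at ref + k * frequency."""
    start = datetime.strptime(f"{ref_date} {int(ref_time):06d}", "%Y%m%d %H%M%S")
    step = parse_hhmmss(frequency)
    return [start + k * step for k in range(count)]


print(records_per_file("030000", "240000"))
print(write_times("20000414", "000000", "060000", 3))
```

With a 3-hourly frequency and a 24-hour duration, for example, each file would hold eight times.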

The fields entry is described in more detail here as it has several options. The entry can consist of multiple lines, each of which may have two to four entries. For example:

  geosgcm_prog.fields:    'PHIS'     , 'AGCM'         ,
                          'SLP'      , 'DYN'          ,
                          'U;V'      , 'DYN'          ,
                          'ZLE'      , 'DYN'          , 'H'   ,
                          'OMEGA'    , 'DYN'          ,
                          'Q'        , 'MOIST'        , 'QV'  ,      
                          ::

Each line consists of the following:

  • short_name of the variable in the gridded component
  • name of component the variable may be found in
  • optional name to use in the output file in place of the short_name
  • optional modification to the coupler when time averaging. By default the coupler time-averages over the interval; set this to 'MIN' or 'MAX' if you want the minimum or maximum over the interval instead.

Note the entry with U;V in the example above. The ; denotes that the two variables form a vector pair, which must be handled accordingly if the collection is regridded to a new grid.
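As an illustration of how the two to four entries per line map to their meanings, here is a small Python sketch that reads the lines of a fields entry (without the leading label). It is purely illustrative and is not the parser MAPL uses; the function name and the dictionary keys are hypothetical.

```python
def parse_fields(lines):
    """Turn the lines of a fields entry into one dict per field."""
    fields = []
    for line in lines:
        line = line.strip()
        if not line or line == "::":   # "::" closes the table
            continue
        # Split on commas, drop the quotes, and ignore the empty piece
        # left behind by a trailing comma.
        entries = [e.strip().strip("'\"") for e in line.split(",")]
        entries = [e for e in entries if e]
        if not 2 <= len(entries) <= 4:
            raise ValueError(f"expected 2 to 4 entries, got: {line!r}")
        short_name = entries[0]
        fields.append({
            "short_name": short_name,
            "component": entries[1],
            "alias": entries[2] if len(entries) > 2 else None,
            "coupler": entries[3] if len(entries) > 3 else None,
            # A ';' in the short name marks a vector pair such as U;V.
            "vector_pair": tuple(short_name.split(";")) if ";" in short_name else None,
        })
    return fields


example = [
    "'PHIS'     , 'AGCM'         ,",
    "'U;V'      , 'DYN'          ,",
    "'ZLE'      , 'DYN'          , 'H'   ,",
    "::",
]
for field in parse_fields(example):
    print(field)
```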

Collection Keyword Descriptions

  • conservative: use conservative regridding, default 0, 0 - bilinear, 1 - conservative
  • deflate: defaults to 0, deflation level used in NetCDF
  • frequency: this is the frequency to output the collection in HHMMSS format.
  • levels: list of space separated levels to output.
    If no vvars option is specified, these are the actual level indices in the Fortran sense. For example, if you specify 1 2 3, this outputs the levels indexed by 1, 2, and 3 in the undistributed dimension of the underlying Fortran array. If vvars is specified, these are the levels that will be interpolated to and output, in the units of the quantity given by vvars. For example, if ZLE is specified as vvars, you could specify levels such as 10 20 50 100 1000, which would be the heights in meters you want output.
  • mode: 'instantaneous' (default) or 'time-averaged', either time average the fields between writes or just output the instantaneous value.
  • nbits: this performs "bit shaving" and sets 24-nbits bits of the mantissa to zero for each value output, so that nbits bits are retained. This helps compression at the cost of some precision.
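The effect of bit shaving can be illustrated with a short Python sketch. It assumes IEEE-754 single precision (23 stored mantissa bits plus one implicit bit, hence the 24) and simply truncates the low-order bits. The routine MAPL actually uses may round or scale values differently, so treat this only as a picture of the idea; the function name is hypothetical.

```python
import struct


def shave_bits(value, nbits):
    """Zero the lowest (24 - nbits) mantissa bits of a float32 value.

    Plain truncation under an assumed IEEE-754 single-precision layout;
    not necessarily the algorithm MAPL applies.
    """
    if not 1 <= nbits <= 24:
        raise ValueError("nbits must be between 1 and 24")
    drop = 24 - nbits
    # Reinterpret the float32 bit pattern as an unsigned 32-bit integer.
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    mask = (0xFFFFFFFF << drop) & 0xFFFFFFFF
    (shaved,) = struct.unpack("<f", struct.pack("<I", bits & mask))
    return shaved


print(shave_bits(3.14159265, 10))
```

The long run of trailing zero bits this leaves in every value is what lets the netcdf deflate compression work better.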

Advanced options

Vertical regridding

Expression in History

Output variables in Bundles

Splitting 4-D fields

Outputting monthly data

Tips for History

  • Unless you are interpolating to a set of levels, you cannot mix variables defined on vertical centers with variables defined on vertical edges in one collection, because only one vertical coordinate may be defined in the output. If you want to output both center and edge variables on the native levels, you must write two collections.