netCDF-schema CF 1.11 implementation #357
gtesoro
started this conversation in
Comments and ideas for changing CF
Concept
netCDF-Schema is a new tool under development that aims to provide an easy way to define and validate netCDF products.
The idea behind it is to have schemas, defined as `yaml` files, that describe the structure and relationships of a netCDF file. Each schema follows the structure of a validation model, which the tool will use for validation.
The goal for the initial implementation is to have a validation model with all the necessary features so that the CF Conventions (version 1.11 at the time of writing) can be translated into a schema.
Here is the latest validation model and CF-11 schema.
How it works
Schemas have a root where all `groups` under validation are defined. Each of these `groups` may contain entries for `dimensions`, `attributes` and `variables`. Group names can be literals or regular expressions, to match more complex paths.
Within each of the `elements`, validation items can be defined. These items are composed of `criteria` and/or `schema`. `criteria` represents a selector: elements of the given type are filtered based on their contents. Once these elements are selected, the `schema` is applied to each of them, raising errors if any selected element does not comply with it. This structure allows for conditional validation (e.g. if a variable has attribute X, it also needs attribute Y).
If `criteria` is not present, `schema` will be applied to all the elements in the file.
If `schema` is not present, only the `count` check will be applied.
Comparators
The current validation model provides many ad-hoc comparators to validate complex relationships in a netCDF model, particularly in the context of CF. Most are self-explanatory, but a few require some explanation.
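As a rough illustration of how `criteria`, `schema` and the comparators fit together, the following Python sketch selects every variable that has a `coordinates` attribute and checks that each word in it names an existing variable. It is not the tool's implementation: the dictionary standing in for a netCDF file and the helper names (`has_attribute`, `parse_and_find`, `validate`) are all hypothetical, chosen only for this example.

```python
import re

# Toy stand-in for a netCDF file: variable name -> attributes.
# (Purely illustrative; a real tool would read these from the file.)
variables = {
    "temp": {"units": "K", "coordinates": "lat lon"},
    "lat": {"units": "degrees_north"},
    "lon": {"units": "degrees_east"},
    "wind": {"units": "m s-1", "coordinates": "lat height"},
}


def has_attribute(name):
    """Hypothetical criteria: select variables that carry the given attribute."""
    return lambda attrs: name in attrs


def parse_and_find(attr_name, pattern, known_names):
    """Hypothetical schema: split an attribute value with a regex (parse) and
    require that each token names an existing variable (find)."""
    def check(attrs):
        tokens = [t for t in re.split(pattern, attrs[attr_name]) if t]
        # Return the tokens that did not match any variable name.
        return [t for t in tokens if t not in known_names]
    return check


def validate(variables, criteria=None, schema=None):
    """Apply `schema` to every variable selected by `criteria`.

    Returns (number of selected variables, {variable name: problems}).
    With no criteria, every variable is selected; with no schema,
    only the count of selected variables is reported.
    """
    selected = {
        name: attrs
        for name, attrs in variables.items()
        if criteria is None or criteria(attrs)
    }
    errors = {}
    if schema is not None:
        for name, attrs in selected.items():
            problems = schema(attrs)
            if problems:
                errors[name] = problems
    return len(selected), errors


count, errors = validate(
    variables,
    criteria=has_attribute("coordinates"),
    schema=parse_and_find("coordinates", r"\s+", set(variables)),
)
print(count, errors)  # 2 {'wind': ['height']}
```

Omitting `schema` leaves only the count of selected elements, and omitting `criteria` applies the schema to every variable, mirroring the two fallback rules described under "How it works".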
Parse
`parse` can be applied to any string value. It splits the string according to the provided regular expression and applies a given comparator to each result. In the example above, the contents of the `coordinates` attribute will be split into words.
Find
The `find` operator tokenizes the left-side value it is applied to and uses the token to find an element, matching it against any of the element's fields. The example above takes each parse entry and looks for a variable whose `name` matches it.
udunits
The `udunits` operator allows for easy checks against the Unidata UDUNITS package.
Locator
A `locator` retrieves a value relative to the matched element. In the example above, we retrieve the name of the variable to check it against the dimension name.
What's next
The current goal is to assess whether the validation model requires any additional features to fully allow for the implementation of CF-11 as a schema.
My intention with this post is to open the discussion for further improvements or feature requests, as well as to tap into your CF expertise to identify potential pitfalls when implementing CF as a schema with the current model.
Once the validation model has been consolidated (I'm considering a soft deadline of end of October), I will start implementing the tool that actually runs the validations, aiming for a 1.0.0 release before the end of the year.