Allow provider-specific data types #529

Open
wants to merge 6 commits into
base: develop
Changes from 3 commits
10 changes: 8 additions & 2 deletions optimade.rst
@@ -215,7 +215,7 @@ Hence, entry properties are described in this proposal using
context-independent types that are assumed to have some form of
representation in all contexts. They are as follows:

- Basic types: **string**, **integer**, **float**, **boolean**, **timestamp**.
- Basic types: **string**, **integer**, **float**, **boolean**, **timestamp**, database-provider-specific or definition-provider-specific data type.
Contributor

The same database-provider-specific or definition-provider-specific data type phrase is repeated multiple times throughout the specification. I like that we are being really specific, but maybe we could define a term somewhere (e.g. custom data type) and then refer to it throughout the text?

Member Author

Fair point, I will check the term definition section to see how well it would fit there.

Contributor

On another point: I think it seems odd to list provider-specific data types under "Basic types". To the extent that we do the separation into Basic vs. list/dictionary on the basis of "contains one thing" vs "many things", it also isn't clear to me why the defined types would have to be seen as the former. I suggest we move the segment about database-specific datatypes below the definitions of the datatypes that are explicitly defined by the standard.

Member Author

My idea is to allow only those provider-specific data types that can be expressed as strings (e.g., symmetry operators, complex numbers, etc.). The timestamp type, which OPTIMADE already has, could fall under the same category of strings with internal semantics. That is why I lumped provider-specific data types together with them. But I agree that they could be split off into a separate segment.

Member Author

> The same database-provider-specific or definition-provider-specific data type phrase is repeated multiple times throughout the specification. I like that we are being really specific, but maybe we could define a term somewhere (e.g. custom data type) and then refer to it throughout the text?

What about calling them "namespace-specific data types"? The specification already has a section "Namespace Prefixes", which is later split into database-specific and definition-provider-specific ones, so I think "namespace" is the right term to use.

- **list**: an ordered collection of items, where all items are of the same type, unless they are unknown.
A list can be empty, i.e., contain no items.
- **dictionary**: an associative array of **keys** and **values**, where **keys** are pre-determined strings, i.e., for the same entry property, the **keys** remain the same among different entries whereas the **values** change.
@@ -627,6 +627,7 @@ In the JSON response format, property types translate as follows:
- **timestamp** uses a string representation of date and time as defined in `RFC 3339 Internet Date/Time Format <https://tools.ietf.org/html/rfc3339#section-5.6>`__.
- **dictionary** is represented by the JSON object type.
- **unknown** properties are represented by either omitting the property or by a JSON :field-val:`null` value.
- database-provider-specific or definition-provider-specific data types use string representations.
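
A minimal sketch of what the JSON mapping above could look like in practice, assuming a hypothetical provider prefix `_exmpl_` and a hypothetical complex-valued property `_exmpl_amplitude`. Neither the property names nor the string serialization of the complex number are defined by the specification; they are illustrative only.

```python
import json
from datetime import datetime, timezone

# Hypothetical entry attributes. "_exmpl_amplitude" and "_exmpl_note" are
# made-up provider-specific properties used only for illustration.
entry_attributes = {
    # timestamp -> RFC 3339 string
    "last_modified": datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc)
    .isoformat()
    .replace("+00:00", "Z"),
    # provider-specific type -> string representation (here: Python's own
    # complex-number notation, chosen arbitrarily for this sketch)
    "_exmpl_amplitude": str(complex(1.5, -2.0)),
    # unknown -> JSON null
    "_exmpl_note": None,
}

payload = json.dumps(entry_attributes)

# A client that knows the hypothetical type can parse the string back.
decoded = json.loads(payload)
amplitude = complex(decoded["_exmpl_amplitude"])
```

A client that does not know the provider-specific type can still treat the value as an opaque string, which is the point of restricting such types to string representations.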

Every response SHOULD contain the following fields, and MUST contain at least :field:`meta`:

@@ -1816,6 +1817,7 @@ The following tokens are used in the filter query component:
(Note that at the end of the string value above the four final backslashes represent the two terminal backslashes in the value, and the final double quote is a terminator, it is not escaped.)

String value tokens are also used to represent **timestamps** in form of the `RFC 3339 Internet Date/Time Format <https://tools.ietf.org/html/rfc3339#section-5.6>`__.
String value tokens are also used to represent database-provider-specific or definition-provider-specific data types.

- **Numeric values** are represented as decimal integers or in scientific notation, using the usual programming language conventions.
A regular expression giving the number syntax is given below as a `POSIX Extended Regular Expression (ERE) <https://en.wikipedia.org/w/index.php?title=Regular_expression&oldid=786659796#Standards>`__ or as a `Perl-Compatible Regular Expression (PCRE) <http://www.pcre.org>`__:
@@ -2070,6 +2072,10 @@ As the filter language syntax does not define a lexical token for timestamps, va
In a comparison with a timestamp property, a string token represents a timestamp value that would result from parsing the string according to RFC 3339 Internet Date/Time Format.
Interpretation failures MUST be reported with error :http-error:`400 Bad Request`.

Database and definition providers MAY introduce custom types, representing them with string lexical tokens both in filters and responses.
It is up to the providers to decide which comparison operators to support and how the comparisons should be performed.
For example, if a provider introduces a set-valued property :property:`_exmpl_set`, it may decide to define operator :val:`CONTAINS` so that :filter:`identifier CONTAINS set` is true if :val:`set` is a subset of the property value.
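
As an illustration of the set example above, here is a rough sketch of how a provider might evaluate such a comparison on the server side. It assumes a hypothetical comma-separated string serialization for sets (e.g. `"Si,O"`); the proposed text does not prescribe any serialization, and the function names are invented for this sketch.

```python
def parse_set(token):
    """Parse a hypothetical comma-separated set serialization, e.g. "Si, O"."""
    return frozenset(item.strip() for item in token.split(",") if item.strip())


def contains(property_value, filter_value):
    """Return True if the set in the filter string is a subset of the property's set.

    Both arguments are the string representations of the custom type,
    as they would appear in a response and in a filter, respectively.
    """
    return parse_set(filter_value) <= parse_set(property_value)
```

With this reading, a filter such as `_exmpl_set CONTAINS "O,Si"` would match an entry whose property value is `"Si, O, Al"`, regardless of element order.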
merkys marked this conversation as resolved.

Optional filter features
~~~~~~~~~~~~~~~~~~~~~~~~

@@ -2158,7 +2164,7 @@ A Property Definition MUST be composed according to the combination of the requi

- :field:`x-optimade-type`: String.
Specifies the OPTIMADE data type for this level of the defined property.
MUST be one of :val:`"string"`, :val:`"integer"`, :val:`"float"`, :val:`"boolean"`, :val:`"timestamp"`, :val:`"list"`, or :val:`"dictionary"`.
MUST be one of :val:`"string"`, :val:`"integer"`, :val:`"float"`, :val:`"boolean"`, :val:`"timestamp"`, :val:`"list"`, :val:`"dictionary"`, or a string naming a database-provider-specific or definition-provider-specific data type, starting with a provider-specific prefix.

- :field:`x-optimade-unit`: String.
A (compound) symbol for the physical unit in which the value of the defined property is given or one of the strings :val:`dimensionless` or :val:`inapplicable`.
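
To make the proposed :field:`x-optimade-type` rule concrete, a validator could be sketched roughly as follows. The exact shape of a provider-specific prefix is an assumption here (an underscore, a lowercase alphanumeric provider identifier, another underscore, then a name, as in `_exmpl_set`); the regular expression and function name are illustrative and not taken from the specification.

```python
import re

# Types explicitly listed by the standard.
STANDARD_TYPES = {
    "string", "integer", "float", "boolean", "timestamp", "list", "dictionary",
}

# ASSUMPTION: a provider-specific type name looks like "_<provider>_<name>".
PREFIXED_TYPE = re.compile(r"_[a-z0-9]+_[a-z0-9_]+")


def is_valid_optimade_type(type_name):
    """Accept a standard type or a name carrying a provider-specific prefix."""
    return (
        type_name in STANDARD_TYPES
        or PREFIXED_TYPE.fullmatch(type_name) is not None
    )
```

Under these assumptions, `"timestamp"` and `"_exmpl_set"` are accepted, while an unprefixed custom name such as `"set"` is rejected.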