MessageFormat 2.0 Default Function Registry

This section describes the functions for which each implementation MUST provide a function handler to be conformant with this specification.

Implementations MAY implement additional functions or additional options. In particular, implementations are encouraged to provide feedback on proposed options and their values.

Note

The Stability Policy allows for updates to Default Registry functions to add support for new options. As implementations are permitted to ignore options that they do not support, it is possible to write messages using options not defined below which currently format with no error, but which could produce errors when formatted with a later edition of the Default Registry. Therefore, using options not explicitly defined here is NOT RECOMMENDED.

String Value Selection and Formatting

The :string function

The function :string provides string selection and formatting.

Operands

The operand of :string is either any implementation-defined type that is a string or for which conversion to a string is supported, or any literal value. All other values produce a Bad Operand error.

For example, in Java, implementations of the java.lang.CharSequence interface (such as java.lang.String or java.lang.StringBuilder), the type char, or the class java.lang.Character might be considered as the "implementation-defined types". Such an implementation might also support other classes via the method toString(). This might be used to enable selection of an enum value by name, for example.

Other programming languages would define string and character sequence types or classes according to their local needs, including, where appropriate, coercion to string.

Options

The function :string has no options.

Note

Proposals for string transformation options and reports of implementation experience with user requirements are desired during the Tech Preview.

Selection

When implementing MatchSelectorKeys(resolvedSelector, keys) where resolvedSelector is the resolved value of a selector and keys is a list of strings, the :string selector function performs as described below.

  1. Let compare be the string value of resolvedSelector in Unicode Normalization Form C (NFC) [UAX#15]
  2. Let result be a new empty list of strings.
  3. For each string key in keys:
    1. If key and compare consist of the same sequence of Unicode code points, then
      1. Append key as the last element of the list result.
  4. Return result.
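
The steps above can be sketched in Python as follows. This is a minimal illustration, not a normative implementation: the function name match_string_keys is hypothetical, and the sketch assumes the selector has already been resolved to a string. As in the steps, only the selector value is normalized; keys are compared as given.

```python
import unicodedata


def match_string_keys(resolved_selector: str, keys: list) -> list:
    """Return the keys equal to the NFC-normalized selector value."""
    # Step 1: normalize the selector's string value to NFC.
    compare = unicodedata.normalize("NFC", resolved_selector)
    # Step 2: start with an empty result list.
    result = []
    # Step 3: Python str equality compares code point sequences.
    for key in keys:
        if key == compare:
            result.append(key)
    # Step 4: return the matching keys in their original order.
    return result


# "e" followed by U+0301 COMBINING ACUTE ACCENT normalizes to U+00E9.
print(match_string_keys("e\u0301", ["\u00e9", "e"]) == ["\u00e9"])  # True
```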

Note

Unquoted string literals in a variant do not include spaces. If users wish to match strings that include whitespace (including U+3000 IDEOGRAPHIC SPACE) to a key, the key needs to be quoted.

For example:

.input {$string :string}
.match $string
| space key | {{Matches the string " space key "}}
*             {{Matches the string "space key"}}

Formatting

The :string function returns the string value of the resolved value of the operand.

Note

The function :string does not perform Unicode Normalization of its formatted output. Users SHOULD encode messages and their parts in Unicode Normalization Form C (NFC) unless there is a very good reason not to.

Numeric Value Selection and Formatting

The :number function

The function :number is a selector and formatter for numeric values.

Operands

The function :number requires a Number Operand as its operand.

Options

Some options do not have default values defined in this specification. The defaults for these options are implementation-dependent. In general, the default values for such options depend on the locale, the value of other options, or both.

Note

The names of options and their values were derived from the options in JavaScript's Intl.NumberFormat.

The following options and their values are required to be available on the function :number:

If the operand of the expression is an implementation-defined type, such as the resolved value of an expression with a :number or :integer annotation, it can include option values. These are included in the resolved option values of the expression, with options on the expression taking priority over any option values of the operand.

For example, the placeholder in this message:

.input {$n :number notation=scientific minimumFractionDigits=2}
{{{$n :number minimumFractionDigits=1}}}

would be formatted with the resolved options { notation: 'scientific', minimumFractionDigits: '1' }.
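
The precedence rule can be pictured as a dictionary merge in which expression options overwrite operand options. The following Python sketch is illustrative only; the function name resolve_options is hypothetical, and option values are kept as strings for simplicity.

```python
def resolve_options(operand_options: dict, expression_options: dict) -> dict:
    """Merge option maps; options on the expression take priority."""
    merged = dict(operand_options)      # start from the operand's options
    merged.update(expression_options)   # expression options override them
    return merged


# Options carried by $n from its .input declaration.
operand = {"notation": "scientific", "minimumFractionDigits": "2"}
# Options written on the placeholder itself.
expression = {"minimumFractionDigits": "1"}

print(resolve_options(operand, expression))
# {'notation': 'scientific', 'minimumFractionDigits': '1'}
```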

Note

The following options and option values are being developed during the Technical Preview period.

The following values for the option style are not part of the default registry. Implementations SHOULD avoid creating options that conflict with these, but are encouraged to track development of these options during Tech Preview:

  • currency
  • unit

The following options are not part of the default registry. Implementations SHOULD avoid creating options that conflict with these, but are encouraged to track development of these options during Tech Preview:

  • currency
  • currencyDisplay
    • symbol (default)
    • narrowSymbol
    • code
    • name
  • currencySign
    • accounting
    • standard (default)
  • unit
    • (anything not empty)
  • unitDisplay
    • long
    • short (default)
    • narrow

Default Value of select Option

The value plural is the default for the option select because it is the most common use case for numeric selection. It can be used for exact value matches but also allows for the grammatical needs of languages using CLDR's plural rules. This might not be noticeable in the source language (particularly English), but can cause problems in target locales that the original developer is not considering.

For example, a naive developer might use a special message for the value 1 without considering a locale's need for a one plural:

.input {$var :number}
.match $var
1   {{You have one last chance}}
one {{You have {$var} chance remaining}}
*   {{You have {$var} chances remaining}}

The one variant is needed by languages such as Polish or Russian. Such locales typically also require other keywords such as two, few, and many.

Percent Style

When implementing style=percent, the numeric value of the operand MUST be multiplied by 100 for the purposes of formatting.

For example,

The total was {0.5 :number style=percent}.

should format in a manner similar to:

The total was 50%.
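
As a rough sketch of the scaling step only, the following Python code multiplies the operand by 100 before producing output. The helper names percent_value and format_percent are hypothetical, and the output ignores locale-specific digits, separators, and percent sign placement, all of which a real function handler would need to supply.

```python
from decimal import Decimal


def percent_value(operand: str) -> Decimal:
    """Scale the operand by 100, as style=percent requires."""
    return Decimal(operand) * 100


def format_percent(operand: str) -> str:
    """Locale-independent placeholder formatting (an assumption of this sketch)."""
    scaled = percent_value(operand).normalize()  # drop trailing zeros
    return format(scaled, "f") + "%"             # "f" avoids exponent notation


print(format_percent("0.5"))  # 50%
```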

Selection

The function :number performs selection as described in Number Selection below.

Composition

When an operand or an option value uses a variable annotated, directly or indirectly, by a :number annotation, its resolved value contains an implementation-defined numerical value of the operand of the annotated expression, together with the resolved options' values.

The :integer function

The function :integer is a selector and formatter for matching or formatting numeric values as integers.

Operands

The function :integer requires a Number Operand as its operand.

Options

Some options do not have default values defined in this specification. The defaults for these options are implementation-dependent. In general, the default values for such options depend on the locale, the value of other options, or both.

Note

The names of options and their values were derived from the options in JavaScript's Intl.NumberFormat.

The following options and their values are required in the default registry to be available on the function :integer:

If the operand of the expression is an implementation-defined type, such as the resolved value of an expression with a :number or :integer annotation, it can include option values. In general, these are included in the resolved option values of the expression, with options on the expression taking priority over any option values of the operand. Option values with the following names, however, are discarded if included in the operand:

  • compactDisplay
  • notation
  • minimumFractionDigits
  • maximumFractionDigits
  • minimumSignificantDigits
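
A minimal Python sketch of this filtering follows. The function name resolve_integer_options is hypothetical, and the option names in the usage lines other than the five listed above (signDisplay, useGrouping) are used purely for illustration.

```python
# The five option names that :integer discards when inherited from the operand.
DISCARDED_FOR_INTEGER = {
    "compactDisplay",
    "notation",
    "minimumFractionDigits",
    "maximumFractionDigits",
    "minimumSignificantDigits",
}


def resolve_integer_options(operand_options: dict, expression_options: dict) -> dict:
    """Drop discarded operand options, then let expression options override."""
    merged = {
        name: value
        for name, value in operand_options.items()
        if name not in DISCARDED_FOR_INTEGER
    }
    merged.update(expression_options)
    return merged


inherited = {"notation": "scientific", "signDisplay": "always"}
print(resolve_integer_options(inherited, {"useGrouping": "never"}))
# {'signDisplay': 'always', 'useGrouping': 'never'}
```

Note that the filter applies only to options inherited from the operand; this sketch passes options written on the expression itself through unchanged.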

Note

The following options and option values are being developed during the Technical Preview period.

The following values for the option style are not part of the default registry. Implementations SHOULD avoid creating options that conflict with these, but are encouraged to track development of these options during Tech Preview:

  • currency
  • unit

The following options are not part of the default registry. Implementations SHOULD avoid creating options that conflict with these, but are encouraged to track development of these options during Tech Preview:

  • currency
  • currencyDisplay
    • symbol (default)
    • narrowSymbol
    • code
    • name
  • currencySign
    • accounting
    • standard (default)
  • unit
    • (anything not empty)
  • unitDisplay
    • long
    • short (default)
    • narrow

Default Value of select Option

The value plural is the default for the option select because it is the most common use case for numeric selection. It can be used for exact value matches but also allows for the grammatical needs of languages using CLDR's plural rules. This might not be noticeable in the source language (particularly English), but can cause problems in target locales that the original developer is not considering.

For example, a naive developer might use a special message for the value 1 without considering a locale's need for a one plural:

.input {$var :integer}
.match $var
1   {{You have one last chance}}
one {{You have {$var} chance remaining}}
*   {{You have {$var} chances remaining}}

The one variant is needed by languages such as Polish or Russian. Such locales typically also require other keywords such as two, few, and many.

Percent Style

When implementing style=percent, the numeric value of the operand MUST be multiplied by 100 for the purposes of formatting.

For example,

The total was {0.5 :number style=percent}.

should format in a manner similar to:

The total was 50%.

Selection

The function :integer performs selection as described in Number Selection below.

Composition

When an operand or an option value uses a variable annotated, directly or indirectly, by a :integer annotation, its resolved value contains the implementation-defined integer value of the operand of the annotated expression, together with the resolved options' values.

Number Operands

The operand of a number function is either an implementation-defined type or a literal whose contents match the number-literal production in the ABNF. All other values produce a Bad Operand error.

For example, in Java, any subclass of java.lang.Number plus the primitive types (byte, short, int, long, float, double, etc.) might be considered as the "implementation-defined numeric types". Implementations in other programming languages would define different types or classes according to their local needs.

Note

String values passed as variables in the formatting context's input mapping can be formatted as numeric values as long as their contents match the number-literal production in the ABNF.

For example, if the value of the variable num were the string -1234.567, it would behave identically to the local variable in this example:

.local $example = {|-1234.567| :number}
{{{$num :number} == {$example}}}

Note

Implementations are encouraged to provide support for compound types or data structures that provide additional semantic meaning to the formatting of number-like values. For example, in ICU4J, the type com.ibm.icu.util.Measure can be used to communicate a value that includes a unit or the type com.ibm.icu.util.CurrencyAmount can be used to set the currency and related options (such as the number of fraction digits).

Digit Size Options

Some options of number functions are defined to take a "digit size option". The function handlers for number functions use these options to control aspects of numeric display such as the number of fraction, integer, or significant digits.

A "digit size option" is an option value that the function interprets as a small integer value greater than or equal to zero. Implementations MAY define an upper limit on the resolved value of a digit size option consistent with that implementation's practical limits.

In most cases, the value of a digit size option will be a string that encodes the value as a non-negative integer. Implementations MAY also accept implementation-defined types as the value. When provided as a string, the representation of a digit size option matches the following ABNF:

digit-size-option = "0" / (("1"-"9") [DIGIT])

If the value of a digit size option does not evaluate as a non-negative integer, or if the value exceeds any implementation-defined upper limit or falls below any option-specific lower limit, a Bad Option error is emitted.
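
The string form and the limit checks can be sketched in Python as below. The names parse_digit_size and BadOptionError are hypothetical, the regular expression is one reading of the ABNF above (a single "0", or a non-zero digit optionally followed by one more digit), and the default upper limit of 99 is an assumption that simply mirrors what that grammar can express.

```python
import re

# One reading of the ABNF: "0", or a digit 1-9 optionally followed by one digit.
DIGIT_SIZE_RE = re.compile(r"0|[1-9][0-9]?")


class BadOptionError(ValueError):
    """Stand-in for the Bad Option error named in the text."""


def parse_digit_size(value: str, lower: int = 0, upper: int = 99) -> int:
    """Parse a digit size option string, enforcing optional limits."""
    if not DIGIT_SIZE_RE.fullmatch(value):
        raise BadOptionError(f"not a digit size option: {value!r}")
    number = int(value)
    if number < lower or number > upper:
        raise BadOptionError(f"{number} is outside {lower}..{upper}")
    return number


print(parse_digit_size("2"))   # 2
print(parse_digit_size("20"))  # 20
```

Strings such as "007", "-1", or "100" do not match the pattern and raise BadOptionError in this sketch.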

Number Selection

Number selection has three modes:

  • exact selection matches the operand to explicit numeric keys exactly
  • plural selection matches the operand to explicit numeric keys exactly followed by a plural rule category if there is no explicit match
  • ordinal selection matches the operand to explicit numeric keys exactly followed by an ordinal rule category if there is no explicit match

When implementing MatchSelectorKeys(resolvedSelector, keys) where resolvedSelector is the resolved value of a selector and keys is a list of strings, numeric selectors perform as described below.

  1. Let exact be the JSON string representation of the numeric value of resolvedSelector. (See Determining Exact Literal Match for details.)
  2. Let keyword be a string which is the result of rule selection on resolvedSelector.
  3. Let resultExact be a new empty list of strings.
  4. Let resultKeyword be a new empty list of strings.
  5. For each string key in keys:
    1. If the value of key matches the production number-literal, then
      1. If key and exact consist of the same sequence of Unicode code points, then
        1. Append key as the last element of the list resultExact.
    2. Else if key is one of the keywords zero, one, two, few, many, or other, then
      1. If key and keyword consist of the same sequence of Unicode code points, then
        1. Append key as the last element of the list resultKeyword.
    3. Else, emit a Bad Variant Key error.
  6. Return a new list whose elements are the concatenation of the elements (in order) of resultExact followed by the elements (in order) of resultKeyword.
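
The steps above can be sketched in Python as follows. The sketch takes exact and keyword as already computed inputs (steps 1 and 2), since those depend on the implementation's serialization and rule data. The names match_number_keys and BadVariantKeyError are hypothetical, and the regular expression is an assumed transcription of the number-literal production, which should be checked against the ABNF.

```python
import re

# Assumed transcription of number-literal: optional "-", integer part without
# leading zeros, optional fraction, optional exponent.
NUMBER_LITERAL = re.compile(r"-?(0|[1-9][0-9]*)(\.[0-9]+)?([eE][+-]?[0-9]+)?")
KEYWORDS = {"zero", "one", "two", "few", "many", "other"}


class BadVariantKeyError(ValueError):
    """Stand-in for the Bad Variant Key error named in the steps."""


def match_number_keys(exact: str, keyword: str, keys: list) -> list:
    """Return exact matches first, then rule-keyword matches."""
    result_exact = []
    result_keyword = []
    for key in keys:
        if NUMBER_LITERAL.fullmatch(key):
            if key == exact:
                result_exact.append(key)
        elif key in KEYWORDS:
            if key == keyword:
                result_keyword.append(key)
        else:
            raise BadVariantKeyError(key)
    return result_exact + result_keyword


# The exact key "1" outranks the plural keyword "one" regardless of key order.
print(match_number_keys("1", "one", ["one", "1", "few"]))  # ['1', 'one']
```

Passing an empty string as keyword makes the keyword branch never match, which corresponds to disabling rule-based selection.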

Note

Implementations are not required to implement this exactly as written. However, the observed behavior must be consistent with what is described here.

Rule Selection

Rule selection is intended to support the grammatical matching needs of different languages/locales in order to support plural or ordinal numeric values.

If the option select is set to exact, rule-based selection is not used. Otherwise rule selection matches the operand, as modified by function options, to exactly one of these keywords: zero, one, two, few, many, or other. The keyword other is the default.

Note

Since valid keys cannot be the empty string in a numeric expression, returning the empty string disables keyword selection.

The meaning of the keywords is locale-dependent and implementation-defined. A key that matches the rule-selected keyword is a stronger match than the fallback key * but a weaker match than any exact match key value.

The rules for a given locale might not produce all of the keywords. A given operand value might produce different keywords depending on the locale.

Apply the rules to the resolved value of the operand and the relevant function options, and return the resulting keyword. If no rules match, return other.

If the option select is set to plural, the rules applied to selection SHOULD be the CLDR plural rule data of type cardinal. See charts for examples.

If the option select is set to ordinal, the rules applied to selection SHOULD be the CLDR plural rule data of type ordinal. See charts for examples.

Example. In CLDR 44, the Czech (cs) plural rule set can be found here.

A message in Czech might be:

.input {$numDays :number}
.match $numDays
one  {{{$numDays} den}}
few  {{{$numDays} dny}}
many {{{$numDays} dne}}
*    {{{$numDays} dní}}

Using the rules found above, the results of various operand values might look like:

| Operand value | Keyword | Formatted Message |
|---|---|---|
| 1 | one | 1 den |
| 2 | few | 2 dny |
| 5 | other | 5 dní |
| 22 | other | 22 dní |
| 27 | other | 27 dní |
| 2.4 | many | 2,4 dne |
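
For illustration, the Czech cardinal rules can be approximated in Python as below. The rule conditions are transcribed from memory of the CLDR data (one: integer part is 1 with no visible fraction digits; few: integer part is 2 through 4 with no visible fraction digits; many: any visible fraction digits; otherwise other) and should be verified against the CLDR charts. The function name czech_cardinal_keyword is hypothetical. Under these rules as transcribed, 22 falls into other, because few covers only 2 through 4.

```python
def czech_cardinal_keyword(value: str) -> str:
    """Return the plural keyword for a plain decimal string such as '22' or '2.4'."""
    integer_part, _, fraction_part = value.lstrip("-").partition(".")
    i = int(integer_part)     # CLDR operand i: integer digits
    v = len(fraction_part)    # CLDR operand v: count of visible fraction digits
    if v != 0:
        return "many"
    if i == 1:
        return "one"
    if 2 <= i <= 4:
        return "few"
    return "other"


for operand in ["1", "2", "5", "27", "2.4"]:
    print(operand, czech_cardinal_keyword(operand))
```

The operand is taken as a string so that visible fraction digits (for example, "1.0" versus "1") are preserved, since they affect the selected keyword.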

Determining Exact Literal Match

Important

The exact behavior of exact literal match is currently only well defined for non-zero-filled integer values. Functions that use fraction digits or significant digits might work in specific implementation-defined ways. Users should avoid depending on these types of keys in message selection in this release.

Number literals in the MessageFormat 2 syntax use the format defined for a JSON number. A resolvedSelector exactly matches a numeric literal key if, when the numeric value of resolvedSelector is serialized using the format for a JSON number, the two strings are equal.
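
A small Python sketch of this comparison follows, using json.dumps as one possible JSON number serializer. The function name exactly_matches is hypothetical. The floating-point case shows why only integer values are currently well defined: the same numeric value can serialize in more than one way.

```python
import json


def exactly_matches(numeric_value, key: str) -> bool:
    """Compare the JSON serialization of the value with the literal key."""
    return json.dumps(numeric_value) == key


print(exactly_matches(1, "1"))      # True
print(exactly_matches(1.0, "1"))    # False: this serializer emits "1.0"
print(exactly_matches(1.0, "1.0"))  # True
```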

Note

The above description of numeric matching contains open issues in the Technical Preview, since a given numeric value might be formatted in several different ways under RFC8259 and since the effect of formatting options, such as the number of fraction digits or significant digits, is not described. The Working Group intends to address these issues before final release with a number of design options being considered.

Users should avoid creating messages that depend on exact matching of non-integer numeric values. Feedback, including use cases encountered in message authoring, is strongly desired.

Date and Time Value Formatting

This subsection describes the functions and options for date/time formatting. Selection based on date and time values is not required in this release.

Note

Selection based on date/time types is not required by MF2. Implementations should use care when defining selectors based on date/time types. The types of queries found in implementations such as java.time.TemporalAccessor are complex and user expectations may be inconsistent with good I18N practices.

The :datetime function

The function :datetime is used to format date/time values, including the ability to compose user-specified combinations of fields.

If no options are specified, this function defaults to the following:

  • {$d :datetime} is the same as {$d :datetime dateStyle=medium timeStyle=short}

Note

The default formatting behavior of :datetime is inconsistent with Intl.DateTimeFormat in JavaScript and with {d,date} in ICU MessageFormat 1.0. This is because, unlike those implementations, :datetime is distinct from :date and :time.

Operands

The operand of the :datetime function is either an implementation-defined date/time type or a date/time literal value, as defined in Date and Time Operands. All other operand values produce a Bad Operand error.

Options

The :datetime function uses either style options or a collection of field options to control the formatted output, but not both.

If both are specified, a Bad Option error MUST be emitted and a fallback value used as the resolved value of the expression.
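
The mutual-exclusion check might be sketched in Python as follows. The names check_datetime_options and BadOptionError are hypothetical. Treating every option listed under Field Options (including hourCycle and timeZoneName) as a field option for the purpose of this check is an assumption of the sketch, and the fallback handling that follows the error is left to the caller.

```python
STYLE_OPTIONS = {"dateStyle", "timeStyle"}
# Assumption: all options listed under "Field Options" count as field options.
FIELD_OPTIONS = {
    "weekday", "era", "year", "month", "day", "hour", "minute", "second",
    "fractionalSecondDigits", "hourCycle", "timeZoneName",
}


class BadOptionError(ValueError):
    """Stand-in for the Bad Option error named in the text."""


def check_datetime_options(options: dict) -> None:
    """Raise BadOptionError if style and field options are mixed."""
    has_style = any(name in STYLE_OPTIONS for name in options)
    has_field = any(name in FIELD_OPTIONS for name in options)
    if has_style and has_field:
        raise BadOptionError("style options and field options cannot be combined")


check_datetime_options({"dateStyle": "long", "timeStyle": "short"})  # accepted
check_datetime_options({"year": "numeric", "month": "long"})         # accepted
```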

If the operand of the expression is an implementation-defined date/time type, it can include style options, field options, or other option values. These are included in the resolved option values of the expression, with options on the expression taking priority over any option values of the operand.

Note

The names of options and their values were derived from the options in JavaScript's Intl.DateTimeFormat.

Style Options

The function :datetime has these style options.

  • dateStyle
    • full
    • long
    • medium
    • short
  • timeStyle
    • full
    • long
    • medium
    • short

Field Options

Field options describe which fields to include in the formatted output and what format to use for that field.

Note

Field options do not have default values because they are only to be used to compose the formatter.

The field options are defined as follows:

Important

The value 2-digit for some field options must be quoted in the MessageFormat syntax because it starts with a digit but does not match the number-literal production in the ABNF.

.local $correct = {$someDate :datetime year=|2-digit|}
.local $syntaxError = {$someDate :datetime year=2-digit}

The function :datetime has the following options:

  • weekday
    • long
    • short
    • narrow
  • era
    • long
    • short
    • narrow
  • year
    • numeric
    • 2-digit
  • month
    • numeric
    • 2-digit
    • long
    • short
    • narrow
  • day
    • numeric
    • 2-digit
  • hour
    • numeric
    • 2-digit
  • minute
    • numeric
    • 2-digit
  • second
    • numeric
    • 2-digit
  • fractionalSecondDigits
    • 1
    • 2
    • 3
  • hourCycle (default is locale-specific)
    • h11
    • h12
    • h23
    • h24
  • timeZoneName
    • long
    • short
    • shortOffset
    • longOffset
    • shortGeneric
    • longGeneric

Note

The following options do not have default values because they are only to be used as overrides for locale-and-value dependent implementation-defined defaults.

The following date/time options are not part of the default registry. Implementations SHOULD avoid creating options that conflict with these, but are encouraged to track development of these options during Tech Preview:

Composition

When an operand or an option value uses a variable annotated, directly or indirectly, by a :datetime annotation, its resolved value contains an implementation-defined date/time value of the operand of the annotated expression, together with the resolved options' values.

The :date function

The function :date is used to format the date portion of date/time values.

If no options are specified, this function defaults to the following:

  • {$d :date} is the same as {$d :date style=medium}

Operands

The operand of the :date function is either an implementation-defined date/time type or a date/time literal value, as defined in Date and Time Operands. All other operand values produce a Bad Operand error.

Options

The function :date has these options:

  • style
    • full
    • long
    • medium (default)
    • short

If the operand of the expression is an implementation-defined date/time type, it can include other option values. Any operand option values matching the :datetime style options or field options are ignored, as is any style option.

Composition

When an operand or an option value uses a variable annotated, directly or indirectly, by a :date annotation, its resolved value is implementation-defined. An implementation MAY emit a Bad Operand or Bad Option error (as appropriate) when this happens.

The :time function

The function :time is used to format the time portion of date/time values.

If no options are specified, this function defaults to the following:

  • {$t :time} is the same as {$t :time style=short}

Operands

The operand of the :time function is either an implementation-defined date/time type or a date/time literal value, as defined in Date and Time Operands. All other operand values produce a Bad Operand error.

Options

The function :time has these options:

  • style
    • full
    • long
    • medium
    • short (default)

If the operand of the expression is an implementation-defined date/time type, it can include other option values. Any operand option values matching the :datetime style options or field options are ignored, as is any style option.

Composition

When an operand or an option value uses a variable annotated, directly or indirectly, by a :time annotation, its resolved value is implementation-defined. An implementation MAY emit a Bad Operand or Bad Option error (as appropriate) when this happens.

Date and Time Operands

The operand of a date/time function is either an implementation-defined date/time type or a date/time literal value, as defined below. All other operand values produce a Bad Operand error.

A date/time literal value is a non-empty string consisting of an ISO 8601 date, or an ISO 8601 datetime optionally followed by a timezone offset. As implementations differ slightly in their parsing of such strings, ISO 8601 date and datetime values not matching the following regular expression MAY also be supported. Furthermore, matching this regular expression does not guarantee validity, given the variable number of days in each month.

(?!0000)[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])(T([01][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9](\.[0-9]{1,3})?(Z|[+-]((0[0-9]|1[0-3]):[0-5][0-9]|14:00))?)?

When the time is not present, implementations SHOULD use 00:00:00 as the time. When the offset is not present, implementations SHOULD use a floating time type (such as Java's java.time.LocalDateTime) to represent the time value. For more information, see Working with Timezones.
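
The following Python sketch shows one way to apply the regular expression and the defaults described above. The names parse_literal and BadOperandError are hypothetical. The pattern is the one given above with extra capture groups added around the year, minute, and second fields so they can be read back. A naive datetime (no tzinfo) stands in for a floating time type when no offset is present.

```python
import re
from datetime import datetime, timedelta, timezone

# Same pattern as above; capture groups added for year, minute, and second.
DATE_TIME_LITERAL = re.compile(
    r"(?!0000)([0-9]{4})-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])"
    r"(T([01][0-9]|2[0-3]):([0-5][0-9]):([0-5][0-9])(\.[0-9]{1,3})?"
    r"(Z|[+-]((0[0-9]|1[0-3]):[0-5][0-9]|14:00))?)?"
)


class BadOperandError(ValueError):
    """Stand-in for the Bad Operand error named in the text."""


def parse_literal(text: str) -> datetime:
    m = DATE_TIME_LITERAL.fullmatch(text)
    if m is None:
        raise BadOperandError(text)
    has_time = m[4] is not None
    hour, minute, second = (int(m[5]), int(m[6]), int(m[7])) if has_time else (0, 0, 0)
    # ".5" means 500 milliseconds, so pad the digits to microseconds.
    micro = int(m[8][1:].ljust(6, "0")) if m[8] else 0
    tz = None  # no offset: keep a floating (naive) value
    if m[9] == "Z":
        tz = timezone.utc
    elif m[9]:
        sign = 1 if m[9][0] == "+" else -1
        hh, mm = m[9][1:].split(":")
        tz = timezone(sign * timedelta(hours=int(hh), minutes=int(mm)))
    try:
        # datetime() rejects dates such as 2024-02-30 that the pattern allows.
        return datetime(int(m[1]), int(m[2]), int(m[3]), hour, minute, second, micro, tzinfo=tz)
    except ValueError as exc:
        raise BadOperandError(text) from exc


print(parse_literal("2024-02-06T16:40:00Z").isoformat())  # 2024-02-06T16:40:00+00:00
print(parse_literal("2024-02-06").isoformat())            # 2024-02-06T00:00:00
```

The final validity check is needed because matching the pattern alone does not guarantee a real calendar date.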

Important

The ABNF and syntax of MF2 do not formally define date/time literals. This means that a message can be syntactically valid but produce a Bad Operand error at runtime.

Note

String values passed as variables in the formatting context's input mapping can be formatted as date/time values as long as their contents are date/time literals.

For example, if the value of the variable now were the string 2024-02-06T16:40:00Z, it would behave identically to the local variable in this example:

.local $example = {|2024-02-06T16:40:00Z| :datetime}
{{{$now :datetime} == {$example}}}

Note

True time zone support in serializations is expected to coincide with the adoption of Temporal in JavaScript. The form of these serializations is known and is a de facto standard. Support for these extensions is expected to be required in the post-tech preview. See: https://datatracker.ietf.org/doc/draft-ietf-sedate-datetime-extended/