Spec: add variant type
sfc-gh-aixu committed Jul 31, 2024
1 parent 506fee4 commit b868ea6
Showing 3 changed files with 691 additions and 17 deletions.
37 changes: 20 additions & 17 deletions format/spec.md
@@ -164,28 +164,30 @@ A **`list`** is a collection of values with some element type. The element field

A **`map`** is a collection of key-value pairs with a key type and a value type. Both the key field and value field each have an integer id that is unique in the table schema. Map keys are required and map values can be either optional or required. Both map keys and map values may be any type, including nested types.

A **`variant`** is a type that represents semi-structured data. A variant value can store a value of any other type, including any primitive, struct, list, or map value. Variant values are encoded in their own binary [encoding](variant-spec.md). The variant type is added in [v3](#version-3).

#### Primitive Types

Supported primitive types are defined in the table below. Primitive types added after v1 have an "added by" version that is the first spec version in which the type is allowed. For example, nanosecond-precision timestamps are part of the v3 spec; using v3 types in v1 or v2 tables can break forward compatibility.
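
The "added by" rule above can be illustrated with a minimal sketch. The function and dictionary names here are hypothetical and not part of the spec or of any library; only the type names and their versions are taken from this section. Note that `variant` is not a primitive type, but it is listed because the text states it is added in v3.

```python
# Hypothetical lookup: first spec version in which each type is allowed.
# Types without an "added by" entry are assumed to be valid from v1.
ADDED_BY_VERSION = {
    "timestamp_ns": 3,
    "timestamptz_ns": 3,
    "variant": 3,  # not a primitive type, but also added in v3
}


def min_format_version(type_name: str) -> int:
    """Return the first spec version that allows the given type."""
    return ADDED_BY_VERSION.get(type_name, 1)


def is_allowed(type_name: str, format_version: int) -> bool:
    """True if the type may be used in a table of the given format version."""
    return format_version >= min_format_version(type_name)


def unsupported_types(type_names, format_version):
    """List the types that would break forward compatibility in this version."""
    return [t for t in type_names if not is_allowed(t, format_version)]
```

A writer could run a check like `unsupported_types(schema_types, 2)` before committing a schema to a v2 table and reject the change if the result is non-empty.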

| Added by version | Primitive type       | Description                                                              | Requirements                                     |
|------------------|----------------------|--------------------------------------------------------------------------|--------------------------------------------------|
|                  | **`boolean`**        | True or false                                                            |                                                  |
|                  | **`int`**            | 32-bit signed integers                                                   | Can promote to `long`                            |
|                  | **`long`**           | 64-bit signed integers                                                   |                                                  |
|                  | **`float`**          | [32-bit IEEE 754](https://en.wikipedia.org/wiki/IEEE_754) floating point | Can promote to double                            |
|                  | **`double`**         | [64-bit IEEE 754](https://en.wikipedia.org/wiki/IEEE_754) floating point |                                                  |
|                  | **`decimal(P,S)`**   | Fixed-point decimal; precision P, scale S                                | Scale is fixed [1], precision must be 38 or less |
|                  | **`date`**           | Calendar date without timezone or time                                   |                                                  |
|                  | **`time`**           | Time of day without date, timezone                                       | Microsecond precision [2]                        |
|                  | **`timestamp`**      | Timestamp, microsecond precision, without timezone                       | [2]                                              |
|                  | **`timestamptz`**    | Timestamp, microsecond precision, with timezone                          | [2]                                              |
| [v3](#version-3) | **`timestamp_ns`**   | Timestamp, nanosecond precision, without timezone                        | [2]                                              |
| [v3](#version-3) | **`timestamptz_ns`** | Timestamp, nanosecond precision, with timezone                           | [2]                                              |
|                  | **`string`**         | Arbitrary-length character sequences                                     | Encoded with UTF-8 [3]                           |
|                  | **`uuid`**           | Universally unique identifiers                                           | Should use 16-byte fixed                         |
|                  | **`fixed(L)`**       | Fixed-length byte array of length L                                      |                                                  |
|                  | **`binary`**         | Arbitrary-length byte array                                              |                                                  |

Notes:

@@ -1288,6 +1290,7 @@ This serialization scheme is for storing single values as individual binary valu
## Appendix E: Format version changes

### Version 3
The `variant` type is added in v3.

Default values are added to struct fields in v3.
* The `write-default` is a forward-compatible change because it is only used at write time. Old writers will fail because the field is missing.