
Determine file encoding for initial state #1396

Closed

segfault-magnet opened this issue Oct 4, 2023 · 3 comments
Labels: regenesis, SDK team

Comments

@segfault-magnet (Contributor)

Current development on the regenesis feature has split off the initial state from the chain config. This buys us some flexibility to select an appropriate encoding for the initial state file.

Some points to consider:

  • Storage -- how much overhead does the encoding have?
  • Cursor -- how quickly can we continue decoding the file when resuming the regenesis?
  • Performance -- both for decoding and encoding
  • Compression -- encoding overhead can be reduced by compression. On the other hand, keeping a cursor to the last loaded element might get complicated. If compression is used, it ideally shouldn't have system dependencies and should be a well-known algorithm.
  • Human readability -- keeping the file human-readable helps with manipulating it, developing tools to analyze it, and testing.
  • Library maturity -- general support for encoding/decoding, and Rust-specific support in particular.
  • Streaming -- can we encode/decode on the fly? What is the API like? Who handles reading/writing the data?
@segfault-magnet (Contributor, Author) commented Oct 10, 2023

An update:

Wrote a small utility to benchmark our use case.

Our expected data volume is enormous -- a 1TB rocksdb database at the moment of regenesis. Benchmarking with this much data directly became problematic, so I resorted to a mixed approach of benchmarks and linear regression.

Testing was done in memory to remove the storage speed variable.
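
For reference, the extrapolation is just an ordinary least-squares fit over the measured points. A minimal sketch of the idea (the sample numbers below are placeholders, not the actual measurements):

```rust
/// Fit y = a + b*x over measured (entries, value) points and extrapolate.
/// Illustrative only; the benchmark utility is what produced the real numbers.
fn extrapolate(samples: &[(f64, f64)], target_entries: f64) -> f64 {
    let n = samples.len() as f64;
    let sum_x: f64 = samples.iter().map(|(x, _)| x).sum();
    let sum_y: f64 = samples.iter().map(|(_, y)| y).sum();
    let sum_xy: f64 = samples.iter().map(|(x, y)| x * y).sum();
    let sum_xx: f64 = samples.iter().map(|(x, _)| x * x).sum();

    let slope = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x * sum_x);
    let intercept = (sum_y - slope * sum_x) / n;

    intercept + slope * target_entries
}

fn main() {
    // Placeholder measurements: (number of entries, bytes on disk).
    let measured = [(10_000.0, 4.0e6), (50_000.0, 20.0e6), (100_000.0, 40.0e6)];
    println!("predicted bytes at 1B entries: {}", extrapolate(&measured, 1.0e9));
}
```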

We were told to expect the contract state to reach on the order of 100M entries, which means a single ContractConfig could take up 6.4GB of memory when loaded (64B per entry, or more depending on decoding overhead).

This made us uneasy about keeping the state and balance collections inside ContractConfig.

We agreed that ContractConfig would need some flattening. We considered encoding the state and balances into separate files, but ultimately decided against it because of the overhead of matching the contract state and balances back to the contract itself: a 'foreign key' would need to be saved with each storage or balance entry. If we don't enforce an ordering while encoding, a columnar file layout would be needed so that we can locate all of a contract's state and balances when parsing a single contract. We briefly considered using a rocksdb snapshot (or sled, or mysql) as the target format itself but didn't pursue the idea further.

An alternative would be introducing a somewhat custom layout to fit our needs exactly. We're currently placing the following requirement on whatever data format is chosen:

  • First all CoinConfigs are encoded,
  • Then all MessageConfigs are encoded,
  • Finally all ContractConfigs are encoded. After each ContractConfig, all of its state and balances follow (in any order) before the next ContractConfig is encoded.

The ContractConfig was broken down into 3 structures:

  1. A ContractState representing one storage slot (a (Bytes32, Bytes32))
  2. A ContractBalance representing the balance of the contract for a particular AssetId (a (Bytes32, u64))
  3. ContractConfig - the original fields minus the above.

An example:

COIN
... (coins)
COIN
MESSAGE
... (messages)
MESSAGE
CONTRACT#1
STATE
STATE
BALANCE
STATE
BALANCE
CONTRACT#2
...
...

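Roughly, the flattened types and the stream they form could be sketched like this (field names are illustrative, not the actual fuel-core definitions):

```rust
// Illustrative layout only; the real fuel-core types carry more fields.
type Bytes32 = [u8; 32];

/// One storage slot of a contract.
struct ContractState {
    key: Bytes32,
    value: Bytes32,
}

/// The balance a contract holds for a particular AssetId.
struct ContractBalance {
    asset_id: Bytes32,
    amount: u64,
}

/// The original ContractConfig fields minus the state/balance collections.
struct ContractConfig {
    contract_id: Bytes32,
    code: Vec<u8>,
    // ... remaining fields, without `state` and `balances`
}

struct CoinConfig {/* owner, amount, asset_id, ... */}
struct MessageConfig {/* sender, recipient, amount, data, ... */}

/// The file is a flat sequence of these entries in the order shown above:
/// all coins, then all messages, then each contract followed by its state
/// and balance entries.
enum Entry {
    Coin(CoinConfig),
    Message(MessageConfig),
    Contract(ContractConfig),
    State(ContractState),
    Balance(ContractBalance),
}
```
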
Test data

Randomly generated Coins, Messages, and Contracts in the above-described ordering.
e.g. 1k test entries = 333 coins, 333 messages, and 333 contracts (each with 10k state entries and 100 balance entries)

Other formats

  • Didn't consider any column-oriented formats (such as Parquet) since we don't benefit from the columnar layout.
  • Didn't consider any formats that need schema code generation (such as Protobuf, Cap'n Proto, etc.).
  • Didn't consider formats whose Rust implementations are poorly maintained.

Compression

Each test is run with and without compression.

Used native Zstd with minimal compression
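
For the compressed runs, the output writer is wrapped in a Zstd stream at the lowest level; roughly like this (a sketch using the zstd crate, not the benchmark code itself):

```rust
use std::fs::File;
use std::io::{BufWriter, Write};

fn compressed_writer(path: &str) -> std::io::Result<impl Write> {
    let file = BufWriter::new(File::create(path)?);
    // Level 1 = minimal compression; `auto_finish` completes the zstd frame on drop.
    Ok(zstd::stream::write::Encoder::new(file, 1)?.auto_finish())
}
```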

Json (serde_json)

Pros:

  • Mature, popular
  • Serde compatible
  • Human readable
  • Can be streamed (JSON Lines)

Cons:

  • We don't need the schema it embeds, which adds encoding overhead

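For context, "streamed (JSON Lines)" here means writing one JSON value per line and decoding line by line; a sketch (not the benchmark utility itself, anyhow is used only for error plumbing):

```rust
use std::io::{BufRead, Write};

use serde::{de::DeserializeOwned, Serialize};

/// Append one entry as a single JSON line.
fn write_entry<T: Serialize>(out: &mut impl Write, entry: &T) -> anyhow::Result<()> {
    serde_json::to_writer(&mut *out, entry)?;
    out.write_all(b"\n")?;
    Ok(())
}

/// Decode entries one line at a time -- resuming is just "skip N lines".
fn read_entries<T: DeserializeOwned>(
    input: impl BufRead,
) -> impl Iterator<Item = anyhow::Result<T>> {
    input
        .lines()
        .map(|line| -> anyhow::Result<T> { Ok(serde_json::from_str(&line?)?) })
}
```
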
Measurements taken for 10k, 20k, ..., 100k entries. Using linear regression to predict usage for up to 1B entries:

Storage

[graph: storage_requirements]

400GB without compression, 255GB with compression.

Encoding performance

[graph: encoding_time]

Around 15m for uncompressed JSON, 1h8m for compressed.

Decoding performance

[graph: decoding_time]

30 minutes for uncompressed JSON, 43 minutes for compressed.

Bincode

Pros:

  • Mature
  • Serde compatible
  • No schema
  • Can be streamed (just encode elements one after another)

Cons:

  • Not human-readable
  • Manual streaming -- collections require their size to be encoded beforehand

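The "just encode elements one after another" streaming amounts to something like this with bincode 1.x (sketch only):

```rust
use std::io::{Read, Write};

use serde::{de::DeserializeOwned, Serialize};

/// Write one entry; no collection length prefix is needed because we never
/// serialize the whole collection at once.
fn write_entry<T: Serialize>(out: &mut impl Write, entry: &T) -> bincode::Result<()> {
    bincode::serialize_into(&mut *out, entry)
}

/// Read the next entry; an UnexpectedEof I/O error signals the end of the stream.
fn read_entry<T: DeserializeOwned>(input: &mut impl Read) -> bincode::Result<T> {
    bincode::deserialize_from(&mut *input)
}
```
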
Measurements taken for 10k, 20k, ..., 100k entries. Using linear regression to predict usage for up to 1B entries:

Storage

[graph: storage_requirements]

Around 290GB uncompressed, 240GB compressed.

Encoding performance

[graph: encoding_time]

Around 13 minutes uncompressed, and around 46 minutes compressed.

Decoding performance

[graph: decoding_time]

27 minutes uncompressed, around 37 minutes compressed.

Bson

Pros:

  • Mature
  • Serde compatible
  • Can be streamed

Cons:

  • Has schema overhead
  • Interface doesn't allow encoding into a writer -- extra allocations
  • No native support for unsigned types; we have to be careful with u64 and larger types since they may not fit into the i64 that BSON supports.
  • Manual streaming -- collections can only be encoded in-memory.

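To illustrate the unsigned-integer caveat, a hypothetical conversion that refuses balances not fitting into BSON's signed Int64 (sketch only, using the bson crate):

```rust
use bson::{doc, spec::BinarySubtype, Binary, Document};

/// BSON has no native u64, so anything above i64::MAX must be rejected
/// (or encoded differently, e.g. as bytes or a string).
fn balance_to_doc(asset_id: [u8; 32], amount: u64) -> Option<Document> {
    let amount = i64::try_from(amount).ok()?;
    Some(doc! {
        "asset_id": Binary { subtype: BinarySubtype::Generic, bytes: asset_id.to_vec() },
        "amount": amount,
    })
}
```
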
Measurements taken for 10k, 20k, ..., 100k entries. Using linear regression to predict usage for up to 1B entries:

Storage

[graph: storage_requirements]

Around 400GB uncompressed, around 250GB compressed.

Encoding performance

[graph: encoding_time]

Around 15m uncompressed, around 45m compressed.

Decoding performance

[graph: decoding_time]

33 minutes uncompressed, 46 minutes compressed.

Bench summary graph

[graph: storage_requirements]
[graph: encoding_time]
[graph: decoding_time]

Compression impact on cursor

Seeking to a location in uncompressed data is fast. This is not the case for compressed data, since you have to decompress even the data you don't care about. There are workarounds should we need them.

The naive approach takes around 13 minutes to decompress and seek to the end of 400GB of compressed data.
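
The naive approach is just decode-and-discard up to the cursor; roughly like this (a sketch using the zstd and bincode crates, with a hypothetical entry type `T`):

```rust
use std::io::Read;

use serde::de::DeserializeOwned;

/// Naive resume: reopen the compressed stream and throw away the first
/// `already_loaded` entries. Everything before the cursor still has to be
/// decompressed (and, here, also decoded).
fn skip_to_cursor<T: DeserializeOwned>(
    compressed: impl Read,
    already_loaded: u64,
) -> anyhow::Result<impl Read> {
    let mut decoder = zstd::stream::read::Decoder::new(compressed)?;
    for _ in 0..already_loaded {
        let _: T = bincode::deserialize_from(&mut decoder)?;
    }
    Ok(decoder)
}
```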

Current summary:
We're going with bincode + optional compression. Starting work on implementing the reading/writing logic.

@MujkicA mentioned this issue Oct 12, 2023
@segfault-magnet (Contributor, Author) commented Nov 7, 2023

Tried one more format, Parquet. Even though we don't benefit from the columnar layout, Parquet has the advantage of encoding data in chunks (solving the cursor + encoding problem).

On the downside, it is not Serde compatible and has poor support for deriving the encoding/decoding code.

Also, the columns of a file should represent one entity (not multiple, as a Rust enum might). That means 5 files: coins, messages, contracts, contract_state, and contract_balance.

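For reference, writing one of the five files with the arrow/parquet crates could look roughly like this (column names and types are illustrative, not the final schema):

```rust
use std::{fs::File, sync::Arc};

use arrow::array::{ArrayRef, FixedSizeBinaryArray, UInt64Array};
use arrow::datatypes::{DataType, Field, Schema};
use arrow::record_batch::RecordBatch;
use parquet::arrow::ArrowWriter;

/// Write a chunk of contract balances into a `contract_balance` Parquet file.
/// Batches are buffered and flushed as row groups, which are the chunks that
/// make resuming (the cursor) cheap.
fn write_balance_chunk(
    path: &str,
    rows: &[([u8; 32], u64)],
) -> Result<(), Box<dyn std::error::Error>> {
    let schema = Arc::new(Schema::new(vec![
        Field::new("asset_id", DataType::FixedSizeBinary(32), false),
        Field::new("amount", DataType::UInt64, false),
    ]));

    let asset_ids =
        FixedSizeBinaryArray::try_from_iter(rows.iter().map(|(id, _)| id.as_slice()))?;
    let amounts = UInt64Array::from_iter_values(rows.iter().map(|(_, amount)| *amount));
    let batch = RecordBatch::try_new(
        schema.clone(),
        vec![Arc::new(asset_ids) as ArrayRef, Arc::new(amounts) as ArrayRef],
    )?;

    // Compression (e.g. Gzip at its lowest level) would be configured via
    // `parquet::file::properties::WriterProperties` instead of `None`.
    let mut writer = ArrowWriter::try_new(File::create(path)?, schema, None)?;
    writer.write(&batch)?;
    writer.close()?;
    Ok(())
}
```
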
Measurements taken for 10k, 20k, ..., 100k entries. Using linear regression to predict usage for up to 1B entries:

Storage

[graph: storage_requirements]

Around 140GB without compression, and slightly more (+5GB) with compression (Gzip at its lowest level, matching the minimal compression levels used for the other formats).

Encoding performance

[graph: encoding_time]

Around 300s (uncompressed) and around 1250s (compressed).

Decoding performance

[graph: decoding_time]

480s (uncompressed) and around 720s (compressed).

All compared

Storage

[graph: storage_requirements]

Encoding performance

[graph: encoding_time]

Decoding performance

[graph: decoding_time]

It seems Parquet is a clear winner. Will proceed to use it instead of bincode for the regenesis.

@xgreenx (Collaborator) commented Jan 31, 2024

Done during #1474

@xgreenx closed this as completed Jan 31, 2024