Database growth rate is too high #1790

Open
npelov opened this issue Sep 26, 2024 · 5 comments

Comments

@npelov

npelov commented Sep 26, 2024

The database size grows too fast. At the current rate, after about 5 years it will no longer fit on a 1 TB SSD. I've seen a few optimizations shipped with the updates, but they only reduce the size a bit, and within 1-2 epochs the growth catches up. The updates that need a DB migration add another problem: once the database grows to 1 TB, you'll need 2 TB of free space to migrate whenever the format changes for some reason.

Optimizing the storage format is only a temporary solution; the real problem is the growth rate. Because the CPU limits how big a single plot can be, you have to run multiple plots, and each one will need a 2 TB SSD for its database.

Is there any plan for how to fix this in the future?

@pigmej
Member

pigmej commented Sep 26, 2024

There are a lot of items moving forward here. The major component responsible for the growth is the number of ATXs.

  • ATX merge -> will lower the ballot growth rate
  • POST merge -> will ultimately solve the ATX count problem
  • activation service, which can be seen here: Node Split pm#334

The current migration from 1.6 to 1.7 is a bit special because it had to migrate the whole DB. We don't expect to have a similar need again soon.

@npelov
Author

npelov commented Sep 27, 2024

If I recall correctly there was a DB migration not long ago, at 1.5.
Are you storing all data as binary, the hashes for example? Is it binary or hex? If it's hex, switching to binary would give you more time to solve the growth rate. I'm not aware of what kind of data is stored, but I'm guessing there are lots of hashes.
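For illustration, hex encoding doubles the size of a 32-byte hash compared to storing it as a raw BLOB; a quick check in sqlite (just an illustration, not a claim about your actual schema):

select length(randomblob(32)) as binary_bytes, length(hex(randomblob(32))) as hex_chars;
32|64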
Btw I noticed that after the update the database got ~10 GB bigger.

@pigmej
Member

pigmej commented Sep 27, 2024

Yes, the DB in 1.7 is bigger than in 1.6 by about 10 GiB.

Are you storing all data as binary, the hashes for example? Is it binary or hex? If it's hex, switching to binary would give you more time to solve the growth rate.

It's currently stored in the most optimized format possible. We could add compression with something like zstd BUT that will make CPU requirements higher and would likely introduce latency.

The major problem, as I said, is the number of ATXs, which then drives the number of proposals. ATX merge will result in fewer proposals (likely by about 1/3); POST merge will reduce the number of ATXs.

@npelov
Author

npelov commented Sep 27, 2024

With zstd (level 3 to 5) the CPU won't even notice; the disk I/O will take much more time than zstd even on NVMe. However, compression only makes sense if the data is big enough, and a quick select showed that the blobs are quite small:
select min(length(atx)),max(length(atx)) as size from (select * from atx_blobs limit 2000000);
408|1364

zstd -3 blob_dump.bin
blob_dump.bin : 94.50% ( 1.22 KiB => 1.16 KiB, blob_dump.bin.zst)

That's not enough to justify it. If the data were >50 KB and compressed to 50% or less on average, then it might be worth it. But it's pretty random-looking data, so it won't compress much. I tried compressing 30 blobs concatenated:
zstd -5 dump.bin
dump.bin : 86.49% ( 33.5 KiB => 29.0 KiB, dump.bin.zst)

Still not good enough.

One problem with storing it all in a database is that an index costs as much space as the indexed column plus the rowid (8 bytes), i.e. 32 + 8 = 40 bytes per row.
If you don't declare an INTEGER PRIMARY KEY, the table still has a rowid, it's just hidden. So if you declare one for the atxs table it costs you nothing, and you can use it to reference blobs in the atx_blobs table (and drop the 32-byte id from atx_blobs). Depending on how you use these tables that might not be possible. Also, if you then want to find a blob by hash you might need a join; you can benchmark how much more that costs. If you do it, it saves (32-byte column + 32+8-byte index entry) * NumRows(atx_blobs) = 72 bytes per row.
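Roughly what I mean, with guessed table and column names (the real go-spacemesh schema may differ):

create table atxs (
  id integer primary key,        -- alias for the rowid, costs nothing extra
  atxid blob not null unique     -- the 32-byte hash, indexed once via the unique constraint
  -- ... other columns ...
);

create table atx_blobs (
  atx_ref integer primary key references atxs(id),  -- reuses the rowid, no separate id or index
  atx blob not null
);

-- looking up a blob by hash now needs a join:
select b.atx from atxs a join atx_blobs b on b.atx_ref = a.id where a.atxid = ?;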

Epochs 1 to 27 have ~55M rows, which is ~3.7 GB saved.
That's roughly 5-6 GB up to epoch 31 (estimated, it takes too long to count the rows), plus ~500 MB saved for each new epoch.
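That ~3.7 GB comes from multiplying the row count by the 72 bytes saved per row; as a quick sanity check in sqlite:

select 55000000 * 72 / 1024.0 / 1024 / 1024;  -- ≈ 3.7 GiB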

It's not much, but the rowid is there whether or not you use it, so you might as well take advantage of it.

Well, good luck! I hope you find a solution soon, because otherwise it'll get out of hand.

@pigmej
Member

pigmej commented Sep 27, 2024

Yes, moving from hashes to ids is planned in go-spacemesh.

It will make the DB a bit smaller, but then it will grow again with each new epoch, etc.

We have checked the zstd module for sqlite too, but sadly it does not help much (https://phiresky.github.io/blog/2022/sqlite-zstd/)

The ultimate solution for that is POST merge, which is what we're aiming for. The optimizations you mention sadly don't help in the long run, as you noticed. POST merge should help tremendously with the ATX count and therefore have a bigger effect than any of the micro-optimizations, especially over time.
