Database growth rate is too high #1790

Open
npelov opened this issue Sep 26, 2024 · 5 comments

Comments

@npelov

npelov commented Sep 26, 2024

The database size grows too fast. At the current rate, after about 5 years it will no longer fit on a 1 TB SSD. I've seen a few optimizations shipped with the updates, but they only reduce the size a bit, and within 1-2 epochs the growth catches up. The updates that need a DB migration add another problem: once the database grows to 1 TB, you'll need 2 TB of free space to migrate whenever the format changes for some reason.

Optimizing the storage format is only a temporary solution; the real problem is the growth rate. Because the CPU limits how big a single plot can be, you have to run multiple plots, and each one will need a 2 TB SSD for its database.

Is there any plan for how to fix this in the future?

@pigmej
Member

pigmej commented Sep 26, 2024

There are a lot of items moving forward here. The major component responsible for the growth is the number of ATXs.

  • ATX merge -> will lower the ballot growth rate
  • POST merge -> will ultimately solve the ATX count problem
  • activation service, which can be seen here: Node Split pm#334

The current migration from 1.6 to 1.7 is a bit special because it had to migrate the whole DB. We don't expect to have a similar need again soon.

@npelov
Author

npelov commented Sep 27, 2024

If I recall correctly there was a DB migration not long ago, at 1.5.
Are you storing all data as binary, the hashes for example? Is it binary or hex? If it's hex, switching to binary would give you more time to solve the growth rate. I'm not aware of what kind of data is stored, but I'm guessing there are lots of hashes.
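For illustration, hex encoding doubles the size of a 32-byte hash compared to storing it as a raw BLOB; a quick check in sqlite (just an illustration, not a claim about your actual schema):

select length(randomblob(32)) as binary_bytes, length(hex(randomblob(32))) as hex_chars;
32|64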
Btw I noticed that after the update the database got ~10 GB bigger.

@pigmej
Member

pigmej commented Sep 27, 2024

Yes, the DB in 1.7 is bigger than in 1.6 by about 10 GiB.

Are you storing all data as binary, the hashes for example? Is it binary or hex? If it's hex, switching to binary would give you more time to solve the growth rate.

It's currently stored in the most optimized format possible. We could add compression with something like zstd BUT that will make CPU requirements higher and would likely introduce latency.

The major problem, as I said, is the number of ATXs, which then drives the number of proposals. ATX merge will result in fewer proposals (likely by about 1/3); POST merge will reduce the number of ATXs.

@npelov
Author

npelov commented Sep 27, 2024

With zstd (level 3 to 5) the CPU won't even notice; the disk I/O will take much more time than zstd even on NVMe. However, compression only makes sense if the data is big enough, and a quick select showed that the blobs are quite small:
select min(length(atx)),max(length(atx)) as size from (select * from atx_blobs limit 2000000);
408|1364

zstd -3 blob_dump.bin
blob_dump.bin : 94.50% ( 1.22 KiB => 1.16 KiB, blob_dump.bin.zst)

That's not enough to justify it. If the data were >50 KB and compressed to 50% or less on average, then it might be worth it. But it's pretty random-looking data, so it won't compress much. I tried compressing 30 blobs concatenated:
zstd -5 dump.bin
dump.bin : 86.49% ( 33.5 KiB => 29.0 KiB, dump.bin.zst)

Still not good enough.

One problem with storing it all in a database is that an index costs as much space as the indexed column plus the rowid (8 bytes), i.e. 32 + 8 = 40 bytes per row.
If you don't declare an INTEGER PRIMARY KEY, the table still has a rowid, it's just hidden. So if you declare one for the atxs table it costs you nothing, and you can use it to reference blobs in the atx_blobs table (and drop the 32-byte id from atx_blobs). Depending on how you use these tables that might not be possible. Also, if you then want to find a blob by hash you might need a join; you can benchmark how much more that costs. If you do it, it saves (32-byte column + 32+8-byte index entry) * NumRows(atx_blobs) = 72 bytes per row.
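Roughly what I mean, with guessed table and column names (the real go-spacemesh schema may differ):

create table atxs (
  id integer primary key,        -- alias for the rowid, costs nothing extra
  atxid blob not null unique     -- the 32-byte hash, indexed once via the unique constraint
  -- ... other columns ...
);

create table atx_blobs (
  atx_ref integer primary key references atxs(id),  -- reuses the rowid, no separate id or index
  atx blob not null
);

-- looking up a blob by hash now needs a join:
select b.atx from atxs a join atx_blobs b on b.atx_ref = a.id where a.atxid = ?;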

Epochs 1 to 27 have ~55M rows, which is ~3.7 GB saved.
That's roughly 5-6 GB up to epoch 31 (estimated, it takes too long to count the rows), plus ~500 MB saved for each new epoch.
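That ~3.7 GB comes from multiplying the row count by the 72 bytes saved per row; as a quick sanity check in sqlite:

select 55000000 * 72 / 1024.0 / 1024 / 1024;  -- ≈ 3.7 GiB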

It's not much, but the rowid is there whether or not you use it, so you might as well take advantage of it.

Well, good luck! I hope you find a solution soon, because otherwise it'll get out of hand.

@pigmej
Member

pigmej commented Sep 27, 2024

Yes, moving from hashes to ids is planned in go-spacemesh.

It will make the DB a bit smaller, but then it will grow again with each new epoch, etc.

We have checked the zstd module for sqlite too, but sadly it does not help much (https://phiresky.github.io/blog/2022/sqlite-zstd/)

The ultimate solution for that is POST merge, which is what we're aiming for. The optimizations you mention sadly don't help in the long run, as you noticed. POST merge should help tremendously with the ATX count and therefore have a bigger effect than any of the micro-optimizations, especially over time.
