add(state): Track spending transaction ids by spent outpoints and revealed nullifiers #8895
base: main
Conversation
zebra-state/src/service/finalized_state/disk_format/upgrade/track_tx_locs_by_spends.rs
Force-pushed from fd2bcca to f6faf42.
@mpguerra It looks like it's actually not using much storage space. I was looking at the db metrics printed at startup, which were about double the expected storage requirements prior to the change, but the total size of the state cache directory is about the same as it was before, so I think the db metrics are overestimating the total db size. I checked the number of keys by column family, and by height 2M on Mainnet it's ~10M transparent outputs and ~150M nullifiers in total, not all of which are spent. At 10 bytes per spent transparent output and 5 bytes per nullifier, that should be, at most, ~1GB of additional storage requirements by block 2M. I'll update this comment with the number of nullifiers and transparent outputs at the network chain tip once my node finishes syncing, but it's looking like hiding this behind a feature may have been unnecessary.

Update: At the current network chain tip, it's about 6.2 GB of extra data (5.5 GB + 140M * 5 bytes). Also, it's 14 bytes per spent transparent output, not 10 (I had forgotten about the output index). 6.2 GB doesn't seem excessive, but we could use the feature later if/when caching blocks in their compact format.

Relevant Column Family Sizes:
sprout_nullifiers (Disk: 146.7 MB, Memory: 9.4 MB, num_keys: Some(1663236))
sapling_nullifiers (Disk: 230 MB, Memory: 4.2 MB, num_keys: Some(3068534))
orchard_nullifiers (Disk: 6.3 GB, Memory: 55.1 MB, num_keys: Some(134798732))
tx_loc_by_spent_out_loc (Disk: 5.5 GB, Memory: 6.3 MB, num_keys: Some(316786532))
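As a sanity check on the figures in that comment, multiplying the reported key counts by the per-entry sizes roughly reproduces the totals. This is a hedged back-of-envelope sketch using only the numbers quoted above (the gap up to the reported ~6.2 GB is plausibly per-key database overhead, not something this arithmetic measures):

```rust
// Back-of-envelope check of the storage figures quoted above. Key counts are
// copied from the column family stats in the comment; per-entry sizes are
// 14 bytes per spent transparent output and 5 bytes per nullifier.
fn main() {
    let spent_outputs: u64 = 316_786_532; // tx_loc_by_spent_out_loc num_keys
    let nullifiers: u64 = 1_663_236 + 3_068_534 + 134_798_732; // sprout + sapling + orchard

    let total_bytes = spent_outputs * 14 + nullifiers * 5;
    println!("~{:.1} GB of raw index data", total_bytes as f64 / 1e9); // ~5.1 GB
}
```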
Test failure is unrelated, should be fixed now.
Force-pushed from 1ed1c45 to bf37b97.
Commits:
- …a read method and an update to `prepare_spending_transparent_tx_ids_batch()` for maintaining it when committing blocks to the finalized state. Adds TODOs for remaining production changes needed for issue #8837.
- …`Id` ReadResponse
- …cations in db format upgrade
- …its type to a `Spend` enum
- …aDb instead of DiskDb, checks cancel_receiver before every db operation
- …ng transaction ids
- …logs for progress updates
- …in db format version file
- adds the build metadata to the db version file before adding indexes; deletes indexes when running without the `indexer` feature
- …c when trying to open the db with that column family.
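One commit above checks `cancel_receiver` before every db operation so the format upgrade can stop promptly. Here is a minimal sketch of that pattern using the standard library's `std::sync::mpsc` as a stand-in (the PR reportedly switches to a cloneable `crossbeam-channel` receiver so parallel upgrade workers can share it); the names `CancelFormatChange` and `should_cancel` are illustrative, not Zebra's actual API:

```rust
// Sketch: poll a cancel channel before each db operation during an upgrade.
use std::sync::mpsc::{channel, Receiver, TryRecvError};

struct CancelFormatChange;

/// Returns true if the upgrade should stop: either a cancel signal was sent,
/// or the sender side was dropped.
fn should_cancel(cancel_receiver: &Receiver<CancelFormatChange>) -> bool {
    match cancel_receiver.try_recv() {
        Ok(CancelFormatChange) | Err(TryRecvError::Disconnected) => true,
        Err(TryRecvError::Empty) => false,
    }
}

fn main() {
    let (cancel_sender, cancel_receiver) = channel::<CancelFormatChange>();

    // No signal yet: keep performing db operations.
    assert!(!should_cancel(&cancel_receiver));

    // After a cancel message, the next pre-operation check stops the upgrade.
    cancel_sender.send(CancelFormatChange).unwrap();
    assert!(should_cancel(&cancel_receiver));
    println!("cancel check ok");
}
```

A `crossbeam-channel` receiver supports the same `try_recv` shape, with the added property that clones of the receiver can be polled concurrently from each parallel upgrade worker.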
Force-pushed from bf37b97 to 1ebb1a5.
I think this looks really good and should be safe to merge, as almost everything is behind a feature flag.
The only thing I found is that the acceptance test has been running for more than 30 minutes now; I would like to know your experience with it.
Part of the logs:
2024-11-29T13:19:41.580755Z WARN {zebrad="1ebb1a5" net="Main"}: zebrad::components::sync::progress: chain updates have stalled, state height has not increased for 29 minutes. Hint: check your network connection, and your computer clock and time zone sync_percent=99.742% current_height=Height(2726357) network_upgrade=Nu5 time_since_last_state_block=29m target_block_spacing=1m 15s max_block_spacing=None is_syncer_stopped=false
2024-11-29T13:20:01.150022Z INFO {zebrad="1ebb1a5" net="Main"}:peer_cache_updater: zebra_network::config: updated cached peer IP addresses cached_ip_count=46 peer_cache_file="/home/alfredo/.cache/zebra/network/mainnet.peers"
2024-11-29T13:20:19.131087Z INFO {zebrad="1ebb1a5" net="Main"}:crawl_and_dial{new_peer_interval=30s}:crawl{should_always_dial=false}: zebra_network::peer_set::candidate_set: timeout waiting for peer service readiness or peer responses
2024-11-29T13:20:41.593014Z WARN {zebrad="1ebb1a5" net="Main"}: zebrad::components::sync::progress: chain updates have stalled, state height has not increased for 30 minutes. Hint: check your network connection, and your computer clock and time zone sync_percent=99.742% current_height=Height(2726357) network_upgrade=Nu5 time_since_last_state_block=30m target_block_spacing=1m 15s max_block_spacing=None is_syncer_stopped=false
2024-11-29T13:20:41.947396Z INFO {zebrad="1ebb1a5" net="Main"}:sync:try_to_sync: zebrad::components::sync: starting sync, obtaining new tips state_tip=Some(Height(2726357))
2024-11-29T13:21:19.130654Z INFO {zebrad="1ebb1a5" net="Main"}:crawl_and_dial{new_peer_interval=30s}:crawl{should_always_dial=false}: zebra_network::peer_set::candidate_set: timeout waiting for peer service readiness or peer responses
The test I commented out here ended up failing locally for me:
…aded tokio runtime in `has_spending_transaction_ids` test

The test checks that a prepared finalized state has the indexes. It was documented and updated to use a multi-threaded async runtime in f71c897 (I'm not sure how it was working for me before, the …)

It syncs to the network tip, but it takes ~10 minutes with an up-to-date cached state for me, mostly waiting for the "finished initial sync" log. It should add the indexes within 30 minutes (depending on system resources; the format upgrade is ~10 minutes for me). It keeps failing with "should have spending transaction hash"; I'm not sure why yet.
Motivation
We want to look up transaction ids by their transparent inputs and revealed nullifiers.
Closes #8837.
Closes #8838.
Closes #8922.
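The lookup described above can be pictured as a map from each spend (a transparent outpoint or a revealed nullifier) to the id of the transaction that spent it. A minimal in-memory stand-in follows; the `Spend` type and 32-byte ids here are hypothetical illustrations, not Zebra's actual types:

```rust
// Minimal in-memory stand-in for the on-disk index this PR adds: each spend
// maps to the id of the transaction that spent or revealed it.
use std::collections::HashMap;

#[derive(Clone, PartialEq, Eq, Hash, Debug)]
enum Spend {
    OutPoint { txid: [u8; 32], index: u32 }, // transparent input
    Nullifier([u8; 32]),                     // shielded spend
}

fn main() {
    let mut spending_tx_ids: HashMap<Spend, [u8; 32]> = HashMap::new();

    // When a block is written, record which transaction spent what.
    let spent = Spend::OutPoint { txid: [1; 32], index: 0 };
    spending_tx_ids.insert(spent.clone(), [9; 32]);
    spending_tx_ids.insert(Spend::Nullifier([2; 32]), [9; 32]);

    // Later, look up the spending transaction id by outpoint or nullifier.
    assert_eq!(spending_tx_ids.get(&spent), Some(&[9; 32]));
    println!("lookup ok");
}
```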
Solution
- Adds a `tx_loc_by_spent_out_loc` column family
- Tracks the `TransactionLocation` of spending transactions by spent `OutputLocation`s and nullifiers when writing blocks to the finalized state
- Adds a `spent_utxos` field on non-finalized `Chain`s
- Adds `ReadRequest` and `ReadResponse` variants for querying spending tx ids by outpoint with the `ReadStateService`
- Adds a `spending_transaction_hash()` read function used to handle the new `ReadRequest`
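Based on the ~14 bytes per spent transparent output mentioned in the comments (a spent `OutputLocation` key mapped to a spending `TransactionLocation` value), the entry layout can be sketched as below. The byte widths and helper names are illustrative assumptions, not Zebra's actual disk-format code:

```rust
// Hedged sketch of a tx_loc_by_spent_out_loc entry. Assumed layout:
//   key   = spent OutputLocation: 3-byte height + 2-byte tx index + 4-byte output index (9 bytes)
//   value = spending TransactionLocation: 3-byte height + 2-byte tx index (5 bytes)
// These widths are chosen to add up to the ~14 bytes per spent output noted above.

fn tx_loc_bytes(height: u32, tx_index: u16) -> [u8; 5] {
    // 3-byte big-endian height (block heights fit in 24 bits) + 2-byte tx index.
    let h = height.to_be_bytes();
    let i = tx_index.to_be_bytes();
    [h[1], h[2], h[3], i[0], i[1]]
}

fn spent_output_entry(
    spent_height: u32,
    spent_tx_index: u16,
    output_index: u32,
    spending_height: u32,
    spending_tx_index: u16,
) -> (Vec<u8>, Vec<u8>) {
    // Key: spent OutputLocation = TransactionLocation + output index.
    let mut key = tx_loc_bytes(spent_height, spent_tx_index).to_vec();
    key.extend_from_slice(&output_index.to_be_bytes());
    // Value: the spending transaction's TransactionLocation.
    let value = tx_loc_bytes(spending_height, spending_tx_index).to_vec();
    (key, value)
}

fn main() {
    let (key, value) = spent_output_entry(1_000_000, 3, 1, 1_000_005, 7);
    assert_eq!(key.len(), 9);
    assert_eq!(value.len(), 5);
    println!("{} bytes per spent output entry", key.len() + value.len()); // 14
}
```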
It may be possible to update the `tx_loc_by_transparent_addr_loc` column instead, but adding a new one seemed easier.

Related Changes:
- Converts `cancel_receiver` to a `crossbeam-channel::mpmc` channel and parallelizes the db format upgrade by block

Tests
PR Author's Checklist
PR Reviewer's Checklist