From cb726065762ec26372eb31ccb0f76380a7c9f0f6 Mon Sep 17 00:00:00 2001
From: Ryan Staudt
Date: Fri, 18 Dec 2020 16:08:43 -0600
Subject: [PATCH] multi: Rework utxoset/view to use outpoints.

This modifies the utxoset in the database and related UtxoViewpoint to
store and work with unspent transaction outputs on a per-output basis
instead of at a transaction level.

The primary motivation is to simplify the code, pave the way for a utxo
cache, and generally focus on optimizing runtime performance.

The tradeoff is that this approach does somewhat increase the size of
the serialized utxoset since it means that the transaction hash is
duplicated for each output as a part of the key and some additional
details are duplicated in each output.

The details duplicated in each output include flags encoded into a
single byte that specify whether the containing transaction is a
coinbase, whether the containing transaction has an expiry, and the
transaction type. Additionally, the containing block height and index
are stored in each output.

However, in practice, the size difference isn't all that large, disk
space is relatively cheap, certainly cheaper than memory, and it is much
more important to provide more efficient runtime operation since that is
the ultimate purpose of the daemon.

While performing this conversion, it also simplifies the code to remove
the transaction version information from the utxoset as well as the
spend journal. The logic for only serializing it under certain
circumstances is complicated, and it was only used for the gettxout RPC,
where it has already been removed.

The utxos in the database are automatically migrated to the new format
with this commit and it is possible to interrupt and resume the
migration process.

Finally, it also updates all references and tests that previously dealt
with transaction hashes to use outpoints instead.

An overview of the changes is as follows:

- Remove transaction version from both spent and unspent output entries
  - Update utxo serialization format to exclude the version
  - Modify the spend journal serialization format
    - The old version field is now reserved: it always stores zero when
      written and is ignored when reading
    - This allows old entries to be used by new code without having to
      migrate the entire spend journal
- Convert UtxoEntry to represent a specific utxo instead of a
  transaction with all remaining utxos
  - Optimize for memory usage with an eye toward a utxo cache
    - Combine fields such as whether the containing transaction is a
      coinbase, whether the containing transaction has an expiry, and
      the transaction type into a single byte
    - Align entry fields to eliminate extra padding since ultimately
      there will be a lot of these in memory
    - Introduce a free list for serializing an outpoint to the database
      key format to significantly reduce pressure on the GC
  - Update entries to be keyed by an outpoint rather than just a tx hash
  - Update all related functions that previously dealt with transaction
    hashes to accept outpoints instead
  - Update all callers accordingly
  - Only add individually requested outputs from the mempool when
    constructing a mempool view
- Modify the spend journal to always store the encoded flags with every
  spent txout
  - Introduce code to handle fetching the missing information from
    another utxo from the same transaction in the event an old style
    entry is encountered
    - Make use of a database cursor with seek to do this much more
      efficiently than testing every possible output
  - Combine fields such as whether the containing transaction is a
    coinbase, whether the containing transaction has an expiry, and the
    transaction type into a single byte
    - Use 4 bits instead of 3 for the transaction type to be consistent
      with utxos.
      The extra bit was already unused so this doesn't take any
      additional space
  - Repurpose fully spent flag to represent output spent
    - The spent flag will always be set moving forward, but it is kept
      to identify legacy spend journal entries (if it is not set)
- Introduce ticketMinOuts in place of stakeExtra
  - Renamed stakeExtra as ticketMinOuts and updated all comments to make
    the purpose of the field clearer
  - Only store ticketMinOuts for ticket submission outputs
  - Add TicketMinimalOutputs function on UtxoEntry in place of
    ConvertUtxosToMinimalOutputs
- Always decompress data loaded from the database now that a utxo entry
  only consists of a specific output
- Introduce upgrade code to migrate the utxo set to the new format
  - Update current database version to 9
  - Update current utxo set version to 3
- Update all tests to expect the correct encodings, remove tests that no
  longer apply, and add new ones for the new expected behavior
  - Convert old tests for the legacy utxo format deserialization code to
    test the new function that is used during upgrade
- Introduce a few new functions on UtxoViewpoint
  - AddTxOut for adding an individual txout versus all of them
  - addTxOut to handle the common code between the new AddTxOut and
    existing AddTxOuts
  - RemoveEntry for removing an individual txout
  - fetchEntryByHash for fetching any remaining utxo for a given
    transaction hash
- Remove the ErrDiscordantTxTree error
  - Since utxos are now retrieved using an outpoint, which includes the
    tree, it is no longer possible to hit this error path
---
 blockchain/chainio.go                        |  812 ++++++------
 blockchain/chainio_test.go                   | 1164 +++++++-----------
 blockchain/compress.go                       |  158 +--
 blockchain/compress_test.go                  |   59 +-
 blockchain/error.go                          |    5 -
 blockchain/error_test.go                     |    1 -
 blockchain/fullblocktests/generate.go        |    2 +-
 blockchain/scriptval.go                      |   40 +-
 blockchain/sequencelock.go                   |    2 +-
 blockchain/stakeext.go                       |   15 +-
 blockchain/upgrade.go                        |  404 ++++++
 blockchain/utxoviewpoint.go                  |  924 ++++++++------
 blockchain/utxoviewpoint_test.go             |    4 +-
 blockchain/validate.go                       |  162 ++-
 gcs/blockcf2/blockcf.go                      |    4 +-
 internal/mempool/mempool.go                  |   81 +-
 internal/mempool/mempool_test.go             |   23 +-
 internal/mempool/policy.go                   |    7 +-
 internal/mining/mining.go                    |   49 +-
 internal/mining/mining_harness_test.go       |   64 +-
 internal/mining/mining_view_test.go          |    3 +-
 internal/netsync/manager.go                  |  155 ++-
 internal/rpcserver/interface.go              |   75 +-
 internal/rpcserver/rpcserver.go              |   43 +-
 internal/rpcserver/rpcserverhandlers_test.go |  251 ++--
 rpcadaptors.go                               |   24 +-
 server.go                                    |   11 +-
 27 files changed, 2340 insertions(+), 2202 deletions(-)

diff --git a/blockchain/chainio.go b/blockchain/chainio.go
index 43a5b6a7f2..0c00d13558 100644
--- a/blockchain/chainio.go
+++ b/blockchain/chainio.go
@@ -12,7 +12,7 @@ import (
 	"errors"
 	"fmt"
 	"math/big"
-	"sort"
+	"sync"
 	"time"
 
 	"github.com/decred/dcrd/blockchain/stake/v4"
@@ -28,7 +28,7 @@ import (
 const (
 	// currentDatabaseVersion indicates what the current database
 	// version is.
-	currentDatabaseVersion = 8
+	currentDatabaseVersion = 9
 
 	// currentBlockIndexVersion indicates what the current block index
 	// database version.
@@ -78,7 +78,7 @@ var (
 
 	// utxoSetBucketName is the name of the db bucket used to house the unspent
 	// transaction output set.
-	utxoSetBucketName = []byte("utxosetv2")
+	utxoSetBucketName = []byte("utxosetv3")
 
 	// blockIndexBucketName is the name of the db bucket used to house the block
 	// index which consists of metadata for all known blocks both in the main
@@ -249,15 +249,6 @@ func readDeserializeSizeOfMinimalOutputs(serialized []byte) (int, error) {
 	return offset, nil
 }
 
-// ConvertUtxosToMinimalOutputs converts the contents of a UTX to a series of
-// minimal outputs. It does this so that these can be passed to stake subpackage
-// functions, where they will be evaluated for correctness.
-func ConvertUtxosToMinimalOutputs(entry *UtxoEntry) []*stake.MinimalOutput {
-	minOuts, _ := deserializeToMinimalOutputs(entry.stakeExtra)
-
-	return minOuts
-}
-
 // -----------------------------------------------------------------------------
 // The block index consists of an entry for every known block. It consists of
 // information such as the block header and information about votes.
@@ -478,10 +469,20 @@ func dbMaybeStoreBlock(dbTx database.Tx, block *dcrutil.Block) error {
 // serialized in the reverse order they are spent because later transactions
 // are allowed to spend outputs from earlier ones in the same block.
 //
+// The reserved field below used to keep track of the version of the containing
+// transaction when the spent txout was the final unspent output of the
+// containing transaction. The spent flag (which previously represented that
+// the containing transaction was fully spent), is always set now, but the extra
+// reserved field is kept to allow for backward compatibility. This was kept
+// for backward compatibility, rather than migrating the spend journal, due to
+// the fact that there is not a trivial or efficient way to resurrect the spent
+// txout flags that were previously only stored when the spent txout was the
+// final unspent output of the containing transaction.
+//
 // The serialized format is:
 //
 // [