multi: Rework utxoset/view to use outpoints.
This modifies the utxoset in the database and related UtxoViewpoint to
store and work with unspent transaction outputs on a per-output basis
instead of at a transaction level.

The primary motivation is to simplify the code, pave the way for a
utxo cache, and generally focus on optimizing runtime performance.

The tradeoff is that this approach does somewhat increase the size of
the serialized utxoset since it means that the transaction hash is
duplicated for each output as a part of the key and some additional
details are duplicated in each output.  The details duplicated in each
output include flags encoded into a single byte that specify whether the
containing transaction is a coinbase, whether the containing transaction
has an expiry, and the transaction type.  Additionally, the containing
block height and index are stored in each output.
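
As a rough illustration of the key duplication described above, the sketch
below builds a per-output key of the form <hash><output index><tree>.  The
outPoint type, the outpointKey helper, and the fixed-width encoding are
assumptions made for this example only; the encoding the commit actually uses
lives in blockchain/chainio.go, whose diff is not rendered on this page.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// hashSize is the length of a transaction hash in bytes.
const hashSize = 32

// outPoint is a hypothetical stand-in for an outpoint: a transaction hash, an
// output index, and the tree the transaction lives in.
type outPoint struct {
	hash  [hashSize]byte
	index uint32
	tree  int8
}

// outpointKey serializes op as <hash><output index><tree>.  The fixed-width
// big-endian index and single tree byte are assumptions for illustration.
func outpointKey(op outPoint) []byte {
	key := make([]byte, hashSize+4+1)
	copy(key, op.hash[:])
	binary.BigEndian.PutUint32(key[hashSize:], op.index)
	key[hashSize+4] = byte(op.tree)
	return key
}

func main() {
	var h [hashSize]byte
	h[0] = 0xab
	k0 := outpointKey(outPoint{hash: h, index: 0, tree: 0})
	k1 := outpointKey(outPoint{hash: h, index: 1, tree: 0})

	// Both keys repeat the full transaction hash, which is the size overhead
	// the commit message refers to, yet the keys themselves stay distinct.
	fmt.Println(len(k0), bytes.Equal(k0[:hashSize], k1[:hashSize]), bytes.Equal(k0, k1))
}
```

With this layout every output of a transaction carries its own copy of the
32-byte hash, so a transaction with n unspent outputs stores the hash n times
instead of once.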

However, in practice, the size difference isn't all that large, disk
space is relatively cheap, certainly cheaper than memory, and it is much
more important to provide more efficient runtime operation since that is
the ultimate purpose of the daemon.

While performing this conversion, it also simplifies the code to remove
the transaction version information from the utxoset as well as the
spend journal.  The logic for only serializing it under certain
circumstances is complicated, and it was only used for the gettxout RPC,
where it has already been removed.

The utxos in the database are automatically migrated to the new format
with this commit and it is possible to interrupt and resume the
migration process.

Finally, it also updates all references and tests that previously dealt
with transaction hashes to use outpoints instead.

An overview of the changes is as follows:

- Remove transaction version from both spent and unspent output entries
  - Update utxo serialization format to exclude the version
  - Modify the spend journal serialization format
    - The old version field is now reserved: it is always written as zero
      and ignored when reading
    - This allows old entries to be used by new code without having to
      migrate the entire spend journal
- Convert UtxoEntry to represent a specific utxo instead of a
  transaction with all remaining utxos
  - Optimize for memory usage with an eye toward a utxo cache
    - Combine fields such as whether the containing transaction is a
      coinbase, whether the containing transaction has an expiry, and
      the transaction type into a single byte
    - Align entry fields to eliminate extra padding since ultimately
      there will be a lot of these in memory
    - Introduce a free list for serializing an outpoint to the database
      key format to significantly reduce pressure on the GC
  - Update entries to be keyed by a <hash><output index><tree> outpoint
    rather than just a tx hash
  - Update all related functions that previously dealt with transaction
    hashes to accept outpoints instead
  - Update all callers accordingly
  - Only add individually requested outputs from the mempool when
    constructing a mempool view
- Modify the spend journal to always store the encoded flags with every
  spent txout
  - Introduce code to handle fetching the missing information from
    another utxo from the same transaction in the event an old style
    entry is encountered
    - Make use of a database cursor with seek to do this much more
      efficiently than testing every possible output
  - Combine fields such as whether the containing transaction is a
    coinbase, whether the containing transaction has an expiry, and the
    transaction type into a single byte
    - Use 4 bits instead of 3 for the transaction type to be consistent
      with utxos. The extra bit was already unused so this doesn't take
      any additional space
  - Repurpose fully spent flag to represent output spent
    - The spent flag will always be set moving forward, but it is kept
      to identify legacy spend journal entries (if it is not set)
- Introduce ticketMinOuts in place of stakeExtra
  - Rename stakeExtra to ticketMinOuts and update all comments to make
    the purpose of the field clearer
  - Only store ticketMinOuts for ticket outputs
  - Add TicketMinimalOutputs function on UtxoEntry in place of
    ConvertUtxosToMinimalOutputs
- Always decompress data loaded from the database now that a utxo entry
  only consists of a specific output
- Introduce upgrade code to migrate the utxo set to the new format
  - Update current database version to 9
  - Update current utxo set version to 3
- Update all tests to expect the correct encodings, remove tests that no
  longer apply, and add new ones for the new expected behavior
  - Convert old tests for the legacy utxo format deserialization code to
    test the new function that is used during upgrade
- Introduce a few new functions on UtxoViewpoint
  - AddTxOut for adding an individual txout versus all of them
  - addTxOut to handle the common code between the new AddTxOut and
    existing AddTxOuts
  - RemoveEntry for removing an individual txout
  - fetchEntryByHash for fetching any remaining utxo for a given
    transaction hash
- Remove the ErrDiscordantTxTree error
  - Since utxos are now retrieved using an outpoint, which includes the
    tree, it is no longer possible to hit this error path
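
The cursor-with-seek lookup mentioned in the list above (finding any remaining
utxo for a transaction hash without testing every possible output) can be
sketched roughly as follows.  This is not the commit's code:
seekFirstWithPrefix and the in-memory sorted slice are hypothetical stand-ins
for a database cursor.  The sketch relies only on the fact that keys sharing a
transaction hash prefix are adjacent when keys are sorted byte-wise.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// seekFirstWithPrefix emulates a database cursor Seek over keys sorted in
// ascending byte order.  It returns the first key that starts with prefix, or
// nil when no such key exists.
func seekFirstWithPrefix(sortedKeys [][]byte, prefix []byte) []byte {
	// Seek: binary search for the first key that is >= prefix.
	i := sort.Search(len(sortedKeys), func(i int) bool {
		return bytes.Compare(sortedKeys[i], prefix) >= 0
	})

	// If any key has the prefix, the key found by the seek is one of them.
	if i < len(sortedKeys) && bytes.HasPrefix(sortedKeys[i], prefix) {
		return sortedKeys[i]
	}
	return nil
}

func main() {
	// Hypothetical two-byte "hashes" followed by a one-byte output index.
	keys := [][]byte{
		{0x01, 0x01, 0x05},
		{0x02, 0x02, 0x03},
		{0x02, 0x02, 0x07},
		{0x04, 0x04, 0x00},
	}
	fmt.Printf("%x\n", seekFirstWithPrefix(keys, []byte{0x02, 0x02}))
	fmt.Println(seekFirstWithPrefix(keys, []byte{0x03, 0x03}) == nil)
}
```

A single seek costs one ordered lookup regardless of how many outputs the
transaction has, whereas probing each candidate output index costs one lookup
per index.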
rstaudt2 committed Jan 8, 2021
1 parent 0e72a3e commit d83712d
Showing 27 changed files with 2,345 additions and 2,206 deletions.
793 changes: 360 additions & 433 deletions blockchain/chainio.go

Large diffs are not rendered by default.

1,140 changes: 425 additions & 715 deletions blockchain/chainio_test.go

Large diffs are not rendered by default.

158 changes: 64 additions & 94 deletions blockchain/compress.go
@@ -579,60 +579,35 @@ func decompressTxOutAmount(amount uint64) uint64 {
// -----------------------------------------------------------------------------

// compressedTxOutSize returns the number of bytes the passed transaction output
// fields would take when encoded with the format described above. The
// preCompressed flag indicates the provided amount and script are already
// compressed. This is useful since loaded utxo entries are not decompressed
// until the output is accessed.
// fields would take when encoded with the format described above.
func compressedTxOutSize(amount uint64, scriptVersion uint16, pkScript []byte,
compressionVersion uint32, preCompressed bool, hasAmount bool) int {
compressionVersion uint32, hasAmount bool) int {

scriptVersionSize := serializeSizeVLQ(uint64(scriptVersion))
if preCompressed && !hasAmount {
return scriptVersionSize + len(pkScript)
}
if preCompressed && hasAmount {
return scriptVersionSize + serializeSizeVLQ(compressTxOutAmount(amount)) +
len(pkScript)
}
if !preCompressed && !hasAmount {
if !hasAmount {
return scriptVersionSize + compressedScriptSize(scriptVersion,
pkScript, compressionVersion)
}

// if !preCompressed && hasAmount
return scriptVersionSize + serializeSizeVLQ(compressTxOutAmount(amount)) +
compressedScriptSize(scriptVersion, pkScript, compressionVersion)
}

// putCompressedTxOut potentially compresses the passed amount and script
// according to their domain specific compression algorithms and encodes them
// directly into the passed target byte slice with the format described above.
// The preCompressed flag indicates the provided amount and script are already
// compressed in which case the values are not modified. This is useful since
// loaded utxo entries are not decompressed until the output is accessed. The
// target byte slice must be at least large enough to handle the number of bytes
// returned by the compressedTxOutSize function or it will panic.
// putCompressedTxOut compresses the passed amount and script according to their
// domain specific compression algorithms and encodes them directly into the
// passed target byte slice with the format described above. The target byte
// slice must be at least large enough to handle the number of bytes returned by
// the compressedTxOutSize function or it will panic.
func putCompressedTxOut(target []byte, amount uint64, scriptVersion uint16,
pkScript []byte, compressionVersion uint32, preCompressed bool,
hasAmount bool) int {
if preCompressed && hasAmount {
offset := putVLQ(target, compressTxOutAmount(amount))
offset += putVLQ(target[offset:], uint64(scriptVersion))
copy(target[offset:], pkScript)
return offset + len(pkScript)
}
if preCompressed && !hasAmount {
offset := putVLQ(target, uint64(scriptVersion))
copy(target[offset:], pkScript)
return offset + len(pkScript)
}
if !preCompressed && !hasAmount {
pkScript []byte, compressionVersion uint32, hasAmount bool) int {

if !hasAmount {
offset := putVLQ(target, uint64(scriptVersion))
offset += putCompressedScript(target[offset:], scriptVersion, pkScript,
compressionVersion)
return offset
}

// if !preCompressed && hasAmount
offset := putVLQ(target, compressTxOutAmount(amount))
offset += putVLQ(target[offset:], uint64(scriptVersion))
offset += putCompressedScript(target[offset:], scriptVersion, pkScript,
@@ -641,10 +616,11 @@ func putCompressedTxOut(target []byte, amount uint64, scriptVersion uint16,
}

// decodeCompressedTxOut decodes the passed compressed txout, possibly followed
// by other data, into its compressed amount and compressed script and returns
// them along with the number of bytes they occupied.
// by other data, into its uncompressed amount and script and returns them along
// with the number of bytes they occupied prior to decompression.
func decodeCompressedTxOut(serialized []byte, compressionVersion uint32,
hasAmount bool) (int64, uint16, []byte, int, error) {

var amount int64
var bytesRead int
var offset int
@@ -684,85 +660,79 @@ func decodeCompressedTxOut(serialized []byte, compressionVersion uint32,
scriptSize))
}

// Make a copy of the compressed script so the original serialized data
// can be released as soon as possible.
compressedScript := make([]byte, scriptSize)
copy(compressedScript, serialized[offset:offset+scriptSize])
// Decompress the script.
script := decompressScript(serialized[offset:offset+scriptSize],
compressionVersion)

return amount, uint16(scriptVersion), compressedScript,
offset + scriptSize, nil
return amount, uint16(scriptVersion), script, offset + scriptSize, nil
}

// -----------------------------------------------------------------------------
// Decred specific transaction encoding flags
// txOutFlags defines additional information and state for transaction outputs.
// This is used when serializing both unspent and spent transaction outputs.
//
// Details about a transaction needed to determine how it may be spent
// according to consensus rules are given by these flags.
// The bit representation is:
// bit 0 - containing transaction is a coinbase
// bit 1 - containing transaction has an expiry
// bits 2-5 - transaction type
// bit 6 - output is spent
// bit 7 - unused
//
// The following details are encoded into a single byte, where the index
// of the bit is given in zeroeth order:
// 0: Is coinbase
// 1: Has an expiry
// 2-4: Transaction type
// 5: Unused
// 6: Fully spent
// 7: Unused
//
// 0, 1, and 6 are bit flags, while the transaction type is encoded with a bitmask
// and used to describe the underlying int.
//
// The fully spent flag should always come as the *last* flag (highest bit index)
// in this data type should flags be updated to include more rules in the future,
// such as rules governing new script OP codes. This ensures that we may still use
// these flags in the UTX serialized data without consequence, where the last
// flag indicating fully spent will always be zeroed. Note that currently the
// fully spent flag is stored in bit 6 so that when serializing the flags as a
// VLQ integer it still fits into a single byte.
//
// -----------------------------------------------------------------------------
// The spent flag should always come as the *last* flag (highest bit index) in
// this data type should flags be updated to include more rules in the future,
// such as rules governing new script OP codes. This ensures that we may still
// use these flags in the UTX serialized data without consequence, where the
// last flag indicating spent will always be zeroed. Note that currently the
// spent flag is stored in bit 6 so that when serializing the flags as a VLQ
// integer it still fits into a single byte.
type txOutFlags uint8

const (
// txTypeBitmask describes the bitmask that yields the 3rd, 4th and 5th
// bits from the flags byte.
txTypeBitmask = 0x1c
// txOutFlagCoinBase indicates that a txout was contained in a coinbase tx.
txOutFlagCoinBase = 1 << 0

// txOutFlagHasExpiry indicates that a txout was contained in a tx that included
// an expiry.
txOutFlagHasExpiry = 1 << 1

// txTypeShift is the number of bits to shift flags to the right to yield the
// correct integer value after applying the bitmask with AND.
txTypeShift = 2
// txOutFlagSpent indicates that the output is spent.
txOutFlagSpent = 1 << 6

// txOutFlagTxTypeBitmask describes the bitmask that yields bits 2-5 from
// txOutFlags.
txOutFlagTxTypeBitmask = 0x3c

// txOutFlagTxTypeShift is the number of bits to shift txOutFlags to the
// right to yield the correct integer value after applying the bitmask with
// AND.
txOutFlagTxTypeShift = 2
)

// encodeFlags encodes transaction flags into a single byte.
func encodeFlags(isCoinBase bool, hasExpiry bool, txType stake.TxType, fullySpent bool) byte {
b := uint8(txType)
b <<= txTypeShift
func encodeFlags(isCoinBase bool, hasExpiry bool, txType stake.TxType, spent bool) txOutFlags {
b := txOutFlags(txType)
b <<= txOutFlagTxTypeShift

if isCoinBase {
b |= 0x01 // Set bit 0
b |= txOutFlagCoinBase
}
if hasExpiry {
b |= 0x02 // Set bit 1
b |= txOutFlagHasExpiry
}
if fullySpent {
b |= 0x40 // Set bit 6
if spent {
b |= txOutFlagSpent
}

return b
}

// decodeFlags decodes transaction flags from a single byte into their respective
// data types.
func decodeFlags(b byte) (bool, bool, stake.TxType, bool) {
isCoinBase := b&0x01 != 0
hasExpiry := b&(1<<1) != 0
fullySpent := b&(1<<6) != 0
txType := stake.TxType((b & txTypeBitmask) >> txTypeShift)

return isCoinBase, hasExpiry, txType, fullySpent
}
func decodeFlags(flags txOutFlags) (bool, bool, stake.TxType, bool) {
isCoinBase := flags&txOutFlagCoinBase == txOutFlagCoinBase
hasExpiry := flags&txOutFlagHasExpiry == txOutFlagHasExpiry
txType := (flags & txOutFlagTxTypeBitmask) >> txOutFlagTxTypeShift
spent := flags&txOutFlagSpent == txOutFlagSpent

// decodeFlagsFullySpent decodes whether or not a transaction was fully spent.
func decodeFlagsFullySpent(b byte) bool {
return b&(1<<6) != 0
return isCoinBase, hasExpiry, stake.TxType(txType), spent
}

// absInt64 computes the absolute value of the given int64 and converts it into
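
For a concrete feel of the bit layout documented on txOutFlags, the standalone
sketch below packs and unpacks one flags byte.  It is a simplified mirror of
encodeFlags and decodeFlags rather than the commit's code: pack, unpack, and
the constant names are hypothetical, and a plain uint8 stands in for
stake.TxType.

```go
package main

import "fmt"

// Bit layout taken from the txOutFlags comment in the diff:
// bit 0 coinbase, bit 1 has expiry, bits 2-5 transaction type, bit 6 spent.
const (
	flagCoinBase  = 1 << 0
	flagHasExpiry = 1 << 1
	flagSpent     = 1 << 6
	txTypeMask    = 0x3c
	txTypeShift   = 2
)

// pack combines the fields into a single byte.  txType must fit in 4 bits.
func pack(isCoinBase, hasExpiry bool, txType uint8, spent bool) uint8 {
	b := txType << txTypeShift
	if isCoinBase {
		b |= flagCoinBase
	}
	if hasExpiry {
		b |= flagHasExpiry
	}
	if spent {
		b |= flagSpent
	}
	return b
}

// unpack splits a flags byte back into its fields.
func unpack(b uint8) (isCoinBase, hasExpiry bool, txType uint8, spent bool) {
	isCoinBase = b&flagCoinBase != 0
	hasExpiry = b&flagHasExpiry != 0
	txType = (b & txTypeMask) >> txTypeShift
	spent = b&flagSpent != 0
	return isCoinBase, hasExpiry, txType, spent
}

func main() {
	// Coinbase, no expiry, transaction type 1, spent:
	// (1 << 2) | 0x01 | 0x40 = 0x45.
	b := pack(true, false, 1, true)
	fmt.Printf("0x%02x\n", b)
	fmt.Println(unpack(b))

	// The largest 4-bit transaction type still round trips.
	fmt.Println(unpack(pack(false, true, 15, false)))
}
```

Because the spent flag sits in bit 6 and bit 7 stays clear, every value
produced this way is below 0x80, which is what lets the byte serialize as a
single-byte VLQ as the txOutFlags comment notes.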
59 changes: 5 additions & 54 deletions blockchain/compress_test.go
@@ -368,100 +368,66 @@ func TestCompressedTxOut(t *testing.T) {
amount uint64
scriptVersion uint16
pkScript []byte
compPkScript []byte
version uint32
compressed []byte
hasAmount bool
isCompressed bool
}{
{
name: "nulldata with 0 DCR",
amount: 0,
scriptVersion: 0,
pkScript: hexToBytes("6a200102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f20"),
compPkScript: hexToBytes("626a200102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f20"),
version: 1,
compressed: hexToBytes("00626a200102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f20"),
hasAmount: false,
isCompressed: false,
},
{
name: "pay-to-pubkey-hash dust, no amount",
amount: 0,
scriptVersion: 0,
pkScript: hexToBytes("76a9141018853670f9f3b0582c5b9ee8ce93764ac32b9388ac"),
compPkScript: hexToBytes("001018853670f9f3b0582c5b9ee8ce93764ac32b93"),
version: 1,
compressed: hexToBytes("00001018853670f9f3b0582c5b9ee8ce93764ac32b93"),
hasAmount: false,
isCompressed: false,
},
{
name: "pay-to-pubkey-hash dust, no amount, precompressed",
amount: 0,
scriptVersion: 0,
pkScript: hexToBytes("001018853670f9f3b0582c5b9ee8ce93764ac32b93"),
compPkScript: hexToBytes("001018853670f9f3b0582c5b9ee8ce93764ac32b93"),
version: 1,
compressed: hexToBytes("00001018853670f9f3b0582c5b9ee8ce93764ac32b93"),
hasAmount: false,
isCompressed: true,
},
{
name: "pay-to-pubkey-hash dust, amount",
amount: 546,
scriptVersion: 0,
pkScript: hexToBytes("76a9141018853670f9f3b0582c5b9ee8ce93764ac32b9388ac"),
compPkScript: hexToBytes("001018853670f9f3b0582c5b9ee8ce93764ac32b93"),
version: 1,
compressed: hexToBytes("a52f00001018853670f9f3b0582c5b9ee8ce93764ac32b93"),
hasAmount: true,
isCompressed: false,
},
{
name: "pay-to-pubkey-hash dust, amount, precompressed",
amount: 546,
scriptVersion: 0,
pkScript: hexToBytes("001018853670f9f3b0582c5b9ee8ce93764ac32b93"),
compPkScript: hexToBytes("001018853670f9f3b0582c5b9ee8ce93764ac32b93"),
version: 1,
compressed: hexToBytes("a52f00001018853670f9f3b0582c5b9ee8ce93764ac32b93"),
hasAmount: true,
isCompressed: true,
},
{
name: "pay-to-pubkey uncompressed, no amount",
amount: 0,
scriptVersion: 0,
pkScript: hexToBytes("4104192d74d0cb94344c9569c2e77901573d8d7903c3ebec3a957724895dca52c6b40d45264838c0bd96852662ce6a847b197376830160c6d2eb5e6a4c44d33f453eac"),
compPkScript: hexToBytes("04192d74d0cb94344c9569c2e77901573d8d7903c3ebec3a957724895dca52c6b4"),
version: 1,
compressed: hexToBytes("0004192d74d0cb94344c9569c2e77901573d8d7903c3ebec3a957724895dca52c6b4"),
hasAmount: false,
isCompressed: false,
},
{
name: "pay-to-pubkey uncompressed 1 DCR, amount present",
amount: 100000000,
scriptVersion: 0,
pkScript: hexToBytes("4104192d74d0cb94344c9569c2e77901573d8d7903c3ebec3a957724895dca52c6b40d45264838c0bd96852662ce6a847b197376830160c6d2eb5e6a4c44d33f453eac"),
compPkScript: hexToBytes("04192d74d0cb94344c9569c2e77901573d8d7903c3ebec3a957724895dca52c6b4"),
version: 1,
compressed: hexToBytes("090004192d74d0cb94344c9569c2e77901573d8d7903c3ebec3a957724895dca52c6b4"),
hasAmount: true,
isCompressed: false,
},
}

for _, test := range tests {
targetSz := compressedTxOutSize(0, test.scriptVersion, test.pkScript, currentCompressionVersion, test.isCompressed, test.hasAmount) - 1
targetSz := compressedTxOutSize(0, test.scriptVersion, test.pkScript, currentCompressionVersion, test.hasAmount) - 1
target := make([]byte, targetSz)
putCompressedScript(target, test.scriptVersion, test.pkScript, currentCompressionVersion)

// Ensure the function to calculate the serialized size without
// actually serializing the txout is calculated properly.
gotSize := compressedTxOutSize(test.amount, test.scriptVersion,
test.pkScript, test.version, test.isCompressed, test.hasAmount)
test.pkScript, test.version, test.hasAmount)
if gotSize != len(test.compressed) {
t.Errorf("compressedTxOutSize (%s): did not get "+
"expected size - got %d, want %d", test.name,
@@ -473,7 +439,7 @@ func TestCompressedTxOut(t *testing.T) {
gotCompressed := make([]byte, gotSize)
gotBytesWritten := putCompressedTxOut(gotCompressed,
test.amount, test.scriptVersion, test.pkScript,
test.version, test.isCompressed, test.hasAmount)
test.version, test.hasAmount)
if !bytes.Equal(gotCompressed, test.compressed) {
t.Errorf("compressTxOut (%s): did not get expected "+
"bytes - got %x, want %x", test.name,
@@ -510,10 +476,10 @@ func TestCompressedTxOut(t *testing.T) {
test.name, gotScrVersion, test.scriptVersion)
continue
}
if !bytes.Equal(gotScript, test.compPkScript) {
if !bytes.Equal(gotScript, test.pkScript) {
t.Errorf("decodeCompressedTxOut (%s): did not get "+
"expected script - got %x, want %x",
test.name, gotScript, test.compPkScript)
test.name, gotScript, test.pkScript)
continue
}
if gotBytesRead != len(test.compressed) {
@@ -522,21 +488,6 @@ func TestCompressedTxOut(t *testing.T) {
test.name, gotBytesRead, len(test.compressed))
continue
}

// Ensure the compressed values decompress to the expected
// txout.
gotScript = decompressScript(gotScript, test.version)
localScript := make([]byte, len(test.pkScript))
copy(localScript, test.pkScript)
if test.isCompressed {
localScript = decompressScript(localScript, test.version)
}
if !bytes.Equal(gotScript, localScript) {
t.Errorf("decompressTxOut (%s): did not get expected "+
"script - got %x, want %x", test.name,
gotScript, test.pkScript)
continue
}
}
}

5 changes: 0 additions & 5 deletions blockchain/error.go
@@ -392,11 +392,6 @@ const (
// more than it is allowed.
ErrBadStakebaseValue = ErrorKind("ErrBadStakebaseValue")

// ErrDiscordantTxTree specifies that a given origin tx's content
// indicated that it should exist in a different tx tree than the
// one given in the TxIn outpoint.
ErrDiscordantTxTree = ErrorKind("ErrDiscordantTxTree")

// ErrStakeFees indicates an error with the fees found in the stake
// transaction tree.
ErrStakeFees = ErrorKind("ErrStakeFees")
1 change: 0 additions & 1 deletion blockchain/error_test.go
@@ -102,7 +102,6 @@ func TestErrorKindStringer(t *testing.T) {
{ErrForceReorgWrongChain, "ErrForceReorgWrongChain"},
{ErrForceReorgMissingChild, "ErrForceReorgMissingChild"},
{ErrBadStakebaseValue, "ErrBadStakebaseValue"},
{ErrDiscordantTxTree, "ErrDiscordantTxTree"},
{ErrStakeFees, "ErrStakeFees"},
{ErrNoStakeTx, "ErrNoStakeTx"},
{ErrBadBlockHeight, "ErrBadBlockHeight"},
2 changes: 1 addition & 1 deletion blockchain/fullblocktests/generate.go
@@ -2014,7 +2014,7 @@ func Generate(includeLargeReorg bool) (tests [][]TestInstance, err error) {
tx.TxIn[0].PreviousOutPoint.Tree = wire.TxTreeStake
b.AddTransaction(tx)
})
rejected(blockchain.ErrDiscordantTxTree)
rejected(blockchain.ErrMissingTxOut)

// Create block with no dev subsidy for coinbase transaction.
//