Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Process block headers #25

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

samay-kothari
Copy link
Contributor

@samay-kothari samay-kothari commented Jul 16, 2022

Using the functions introduced in #22 to add the functionality where we would be processing headers just after they are downloaded.

@coveralls
Copy link

coveralls commented Jul 16, 2022

Pull Request Test Coverage Report for Build 2981668692

  • 129 of 152 (84.87%) changed or added relevant lines in 5 files are covered.
  • 3 unchanged lines in 1 file lost coverage.
  • Overall coverage increased (+0.1%) to 53.504%

Changes Missing Coverage Covered Lines Changed/Added Lines %
blockchain/accept.go 11 13 84.62%
blockchain/process.go 10 13 76.92%
blockchain/chain.go 0 18 0.0%
Files with Coverage Reduction New Missed Lines %
peer/peer.go 3 75.6%
Totals Coverage Status
Change from base Build 2794687106: 0.1%
Covered Lines: 27192
Relevant Lines: 50822

💛 - Coveralls

@samay-kothari samay-kothari marked this pull request as ready for review August 4, 2022 18:44
@samay-kothari
Copy link
Contributor Author

@kcalvinalvin can you rerun the test, I don't know why the check is failed here, it passed on my forked repository.
https://github.com/samay-kothari/utreexod/actions/runs/2798870092

@kcalvinalvin
Copy link
Contributor

@kcalvinalvin can you rerun the test, I don't know why the check is failed here, it passed on my forked repository.
https://github.com/samay-kothari/utreexod/actions/runs/2798870092

Ok. For future reference, you can force github to re-run actions by changing the commit hash with git commit --amend and force pushing.

@@ -269,7 +269,8 @@ func newBlockIndex(db database.DB, chainParams *chaincfg.Params) *blockIndex {
// This function is safe for concurrent access.
func (bi *blockIndex) HaveBlock(hash *chainhash.Hash) bool {
bi.RLock()
_, hasBlock := bi.index[*hash]
node := bi.LookupNode(hash)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bi.LookupNode() also tries to acquire the blockindex read lock. It'll work because it's a read lock but it ideally should be avoided. I think it's better to use bi.index[*hash].

@@ -269,7 +269,8 @@ func newBlockIndex(db database.DB, chainParams *chaincfg.Params) *blockIndex {
// This function is safe for concurrent access.
func (bi *blockIndex) HaveBlock(hash *chainhash.Hash) bool {
bi.RLock()
_, hasBlock := bi.index[*hash]
node := bi.LookupNode(hash)
hasBlock := node != nil && node.status.HaveData()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line changes the behavior of the function HaveBlock() the documentation states that it returns whether a block exists in the blockindex. This line also asserts that the blockdata is also downloaded.

HaveBlock is only called in blockchain.blockExists in this repo anyways so why not edit that function instead? HaveBlock is exported so best not to change its behavior unless there's no alternative for backwards compatibility (for potential people using the library).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this makes sense
I will make the required changes to blockExists and test the functionality

@@ -202,7 +202,7 @@ func (b *BlockChain) ProcessBlock(block *btcutil.Block, flags BehaviorFlags) (bo
log.Tracef("Processing block %v", blockHash)

// The block must not already exist in the main chain or side chains.
exists, err := b.blockExists(blockHash)
exists, err := b.HaveBlock(blockHash)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note the comment on HaveBlock. We can change the behavior of blockExists instead of changing the behavior of HaveBlock

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have changed the behaviour of blockExists and it checks and returns, whether there is block data or not

blockchain/process.go Outdated Show resolved Hide resolved
netsync/manager.go Show resolved Hide resolved

// Processing the block headers that are downloaded and if it passes
// all the checks, creating a block node that only contains header.
err := chain.ProcessBlockHeader(blockHeader)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we call this, then we'd also need to get rid of the checkBlockHeaderContext call in checkBlockContext, which is called from maybeAcceptBlock since it's just duplicate check of the same exact thing.

func (b *BlockChain) checkBlockContext(block *btcutil.Block, prevNode *blockNode, flags BehaviorFlags) error {
// Perform all block header related validation checks.
header := &block.MsgBlock().Header
err := b.checkBlockHeaderContext(header, prevNode, flags)
if err != nil {
return err
}

@samay-kothari
Copy link
Contributor Author

samay-kothari commented Aug 12, 2022

Hi calvin, I was wondering that the usage of blockExists here

// Handle orphan blocks.
prevHash := &blockHeader.PrevBlock
prevHashExists, err := b.blockExists(prevHash)
if err != nil {
return false, false, err
}
, where we are checking if the block is orphan or not.

Ideally we should not add the condition of presence of block data along with presence of node while checking if the parent exists or not because the case might be in future, that the parent's header is verified, but the block data is not downloaded yet, so the children should not be marked orphan, just because of absence of block data.

So I think rather than changing the blockExists function, I can add a check for block data wherever it is required.

@samay-kothari samay-kothari force-pushed the processBlockHeaders branch 2 times, most recently from a8f4252 to 3bb7f4b Compare August 14, 2022 21:39
@samay-kothari
Copy link
Contributor Author

samay-kothari commented Aug 14, 2022

Hi @kcalvinalvin, as discussed in the call, i added the flags that prevents the header validation to happen if already the header is verified. I also removed the changes made to the existing blockExists function, and added the new function that checks if the block is present in data, so existing functioning is not getting affected.

err = b.index.flushToDB()
if err != nil {
return false, err
blockHash := blockHeader.BlockHash()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we use blockHash := block.Hash() instead of blockHash := blockHeader.BlockHash().

Reasoning being that block.Hash() will cache the result and will use the cache the next time block.Hash() is called for this block.

Could then move blockHeader := &block.MsgBlock().Header inside the if branch

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay will make these changes

// The block must not already exist in the main chain or side chains.
exists, err := b.blockExists(blockHash)
if err != nil {
return false, false, err
}
if exists {
// Looking up for node and checking if it have block data or not
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Something like:

Look up node and check if it has the block data is better grammatically

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will change have with has. :)

// BFHeaderValidated sets the third bit of the flag and indicates that
// the header has already been validated and the header validation can be
// skipped when block validation is being performed.
BFHeaderValidated BehaviorFlags = 1 << 2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we place BFHeaderValidated before BFNone, it'll automatically have that bit set without doing 1 << 2

So something like

        // BFNoPoWCheck may be set to indicate the proof of work check which
	// ensures a block hashes to a value less than the required target will
	// not be performed.
	BFNoPoWCheck

 	// BFHeaderValidated indicates the header has already been validated
        // and the header validation can be skipped when block validation is
        // being performed.
 	BFHeaderValidated BehaviorFlags

 	// BFNone is a convenience value to specifically indicate no flags.
 	BFNone BehaviorFlags = 0

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay I will do that

// So we add this to the behavioural flag and the flag is then passed to further
// functions, where we don't need to reavalidate the header.
if exists {
flags = flags | BFHeaderValidated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: flags |= BFHeaderValidated would be succinct and better.

// If the third bit of the flag is set, then it means that the header
// is already been validated, so we don't need to revalidate it. So we
// check block header sanity, only when the second bit is not set to 1
if (flags >> 2 & 1) != 1 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

flags&BFHeaderValidated != BFHeaderValidated would be easier to read.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this is more readable and understandable

// If the third bit of the flag is set, then it means that the header
// is already been validated, so we don't need to revalidate it. So we
// check block header context, only when the second bit is not set to 1
if (flags >> 2 & 1) != 1 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. flags&BFHeaderValidated != BFHeaderValidated would be better.

@@ -200,13 +200,20 @@ func (b *BlockChain) ProcessBlock(block *btcutil.Block, flags BehaviorFlags) (bo

blockHash := block.Hash()
log.Tracef("Processing block %v", blockHash)

// The block must not already exist in the main chain or side chains.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is now outdated because the node can exist.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that makes, does Looking up for node and checking if it has block data or not works?

// status.
node := b.IndexLookupNode(hash)
// If the returned node is nil then the block is stored in the
// database and not present in block index, we return true here
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the only case this would happen is if the database is corrupt. There's no case in where a block doesn't exist in the index but exists on disk. Deletion never happens for both the disk and the in-memory index.

Copy link
Contributor

@kcalvinalvin kcalvinalvin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also wrap the commit messages at 70?

Ex:

When the block data is being validated for an already validated header, then we modify the behavioural flag, so that we don't repeat the header validation steps again

should be

When the block data is being validated for an already validated
header, then we modify the behavioural flag, so that we don't repeat
the header validation steps again

Also am seeing some grammatical errors here and there and it'd be better to have them grammatically correct.

err = b.index.flushToDB()
if err != nil {
return false, err
// Lookup for the block node that was created at the time of header
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

diff --git a/blockchain/accept.go b/blockchain/accept.go
index 38c4107b..e42486b1 100644
--- a/blockchain/accept.go
+++ b/blockchain/accept.go
@@ -63,7 +63,7 @@ func (b *BlockChain) maybeAcceptBlock(block *btcutil.Block, flags BehaviorFlags)
        }
 
        // Lookup for the block node that was created at the time of header
-       // validation if the block node is not existent then create a new
+       // validation. If the block node is not existent, then create a new
        // block node for the block and add it to the node index. Even if
        // the block ultimately gets connected to the main chain, it starts
        // out on a side chain.

@@ -200,13 +200,20 @@ func (b *BlockChain) ProcessBlock(block *btcutil.Block, flags BehaviorFlags) (bo

blockHash := block.Hash()
log.Tracef("Processing block %v", blockHash)

// The block must not already exist in the main chain or side chains.
// Checking if node is present in main chain or side chain.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

diff --git a/blockchain/process.go b/blockchain/process.go
index 78409100..0f92040d 100644
--- a/blockchain/process.go
+++ b/blockchain/process.go
@@ -200,19 +200,20 @@ func (b *BlockChain) ProcessBlock(block *btcutil.Block, flags BehaviorFlags) (bo
 
        blockHash := block.Hash()
        log.Tracef("Processing block %v", blockHash)
-       // Checking if node is present in main chain or side chain.
+
+       // Check if the block is present in the main chain or any of the side chains.
        exists, err := b.blockExists(blockHash)
        if err != nil {
                return false, false, err
        }
-       // Looking up for node and checking if it has block data or not
+       // Look up the node and check if the block data is already stored.
        node := b.index.LookupNode(blockHash)
        haveData := false
        if node != nil {
                haveData = node.status.HaveData()
        }
-       // Returning error of duplicate block only if block data is already present
-       // at the node and we recieved duplicated block data.
+
+       // Return an error if the block data is already present.
        if exists && haveData {
                str := fmt.Sprintf("already have block %v", blockHash)
                return false, false, ruleError(ErrDuplicateBlock, str)
@@ -232,7 +233,7 @@ func (b *BlockChain) ProcessBlock(block *btcutil.Block, flags BehaviorFlags) (bo
                }
        }
 
-       // Perform preliminary sanity checks on the block and its transactions.a
+       // Perform preliminary sanity checks on the block and its transactions.
        err = checkBlockSanity(block, b.chainParams.PowLimit, b.timeSource, flags)
        if err != nil {
                return false, false, err

@@ -205,6 +210,12 @@ func (b *BlockChain) ProcessBlock(block *btcutil.Block, flags BehaviorFlags) (bo
if err != nil {
return false, false, err
}
// If the block node is present, it means that the header is already verified.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

diff --git a/blockchain/process.go b/blockchain/process.go
index f8a06d71..01551514 100644
--- a/blockchain/process.go
+++ b/blockchain/process.go
@@ -212,7 +212,7 @@ func (b *BlockChain) ProcessBlock(block *btcutil.Block, flags BehaviorFlags) (bo
        }
        // If the block node is present, it means that the header is already verified.
        // So we add this to the behavioural flag and the flag is then passed to further
-       // functions, where we don't need to reavalidate the header.
+       // functions so that we don't re-validate the header.
        if exists {
                flags |= BFHeaderValidated
        }
@@ -242,6 +242,7 @@ func (b *BlockChain) ProcessBlock(block *btcutil.Block, flags BehaviorFlags) (bo
                        return false, false, err
                }
        }
+
        // Perform preliminary sanity checks on the block and its transactions.a
        err = checkBlockSanity(block, b.chainParams.PowLimit, b.timeSource, flags)
        if err != nil {

err := checkBlockHeaderSanity(header, powLimit, timeSource, flags)
if err != nil {
return err

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

diff --git a/blockchain/validate.go b/blockchain/validate.go
index 65cdee93..5a45e589 100644
--- a/blockchain/validate.go
+++ b/blockchain/validate.go
@@ -468,8 +468,8 @@ func checkBlockSanity(block *btcutil.Block, powLimit *big.Int, timeSource Median
        msgBlock := block.MsgBlock()
        header := &msgBlock.Header
 
-       // If the BFHeaderValidated flag is set, then it means that the header
-       // is already been validated, so we don't need to revalidate it. So we
+       // If the BFHeaderValidated flag is set, it means that the header has
+       // already been validated and we don't need to re-validate it. Only
        // check block header sanity if we BFHeaderValidated is not set.
        if flags&BFHeaderValidated != BFHeaderValidated {
                err := checkBlockHeaderSanity(header, powLimit, timeSource, flags)
@@ -733,9 +733,9 @@ func (b *BlockChain) checkBlockContext(block *btcutil.Block, prevNode *blockNode
        // Perform all block header related validation checks.
        header := &block.MsgBlock().Header
 
-       // If BFHeaderValidated flag is set, then it means that the header is
-       // already been validated, so we don't need to revalidate it. So we check
-       // block header context, only when BFHeaderValidated flag is not set.
+       // If the BFHeaderValidated flag is set, it means that the header has
+       // already been validated and we don't need to re-validate it. Only
+       // check block header sanity if we BFHeaderValidated is not set.
        if flags&BFHeaderValidated != BFHeaderValidated {
                err := b.checkBlockHeaderContext(header, prevNode, flags)
                if err != nil {

if exists || b.IsKnownOrphan(hash) {
// Retrieving node from the blockIndex so that we can check the node
// status.
node := b.IndexLookupNode(hash)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

b.blockExists is calling b.HaveBlock which is just look up the blockhash in the map.

index.LookupNode is calling b.index.LookupNode which is also looking up the blockhash in the map.

Since they're both looking up the same map, there's no way the node can be nil here unless if it's an orphan.

Maybe not handling orphans and the blocks in the same branch would be better.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got your point, that checking for block data for an orphan block has no point, since we can't have an orphan header.
So I can just return true if the block is an orphan.

@@ -228,6 +228,39 @@ func (b *BlockChain) HaveBlock(hash *chainhash.Hash) (bool, error) {
return exists || b.IsKnownOrphan(hash), nil
}

// HaveBlockWithData returns whether or not the chain instance has the block
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

diff --git a/blockchain/chain.go b/blockchain/chain.go
index 2ea09665..4bb1416f 100644
--- a/blockchain/chain.go
+++ b/blockchain/chain.go
@@ -229,7 +229,7 @@ func (b *BlockChain) HaveBlock(hash *chainhash.Hash) (bool, error) {
 }
 
 // HaveBlockWithData returns whether or not the chain instance has the block
-// represented by the passed hash This includes checking the various places
+// represented by the passed hash. This includes checking the various places
 // a block can be like part of the main chain, on a side chain, or in the orphan
 // pool. In addition to this the function also checks for presence of block
 // data along with presence of node.

@@ -1008,7 +1008,17 @@ func (sm *SyncManager) handleHeadersMsg(hmsg *headersMsg) {
peer.Disconnect()
return
}

// Processing the block headers that are downloaded and if it passes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

diff --git a/netsync/manager.go b/netsync/manager.go
index dd5b4fa0..c2457f33 100644
--- a/netsync/manager.go
+++ b/netsync/manager.go
@@ -1008,8 +1008,8 @@ func (sm *SyncManager) handleHeadersMsg(hmsg *headersMsg) {
                        peer.Disconnect()
                        return
                }
-               // Processing the block headers that are downloaded and if it passes
-               // all the checks, creating a block node that only contains header.
+               // Process the block headers that are downloaded and if it passes
+               // all the checks, create a block node that only contains header.
                err := sm.chain.ProcessBlockHeader(blockHeader)
                if err != nil {
                        // Note that there is no need to check for an orphan header here

"disconnecting", blockHeader.BlockHash(), peer, err)
peer.Disconnect()
return
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We wouldn't need this check here at line 1023 because it's redundant with the checkpoint check in b.checkHeaderContext that's called in maybeAcceptBlockHeader, which is called from ProcessBlockHeader.

// Ensure chain matches up to predetermined checkpoints.
blockHash := header.BlockHash()
if !b.verifyCheckpoint(blockHeight, &blockHash) {
str := fmt.Sprintf("block at height %d does not match "+
"checkpoint hash", blockHeight)
return ruleError(ErrBadCheckpoint, str)
}
// Find the previous checkpoint and prevent blocks which fork the main
// chain before it. This prevents storage of new, otherwise valid,
// blocks which build off of old blocks that are likely at a much easier
// difficulty and therefore could be used to waste cache and disk space.
checkpointNode, err := b.findPreviousCheckpoint()
if err != nil {
return err
}
if checkpointNode != nil && blockHeight < checkpointNode.height {
str := fmt.Sprintf("block at height %d forks the main chain "+
"before the previous checkpoint at height %d",
blockHeight, checkpointNode.height)
return ruleError(ErrForkTooOld, str)
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On line 1023 the check for node.height == sm.nextCheckpoint.Height, is to check that if it is a checkpoint then break the loop over headers, which I think is necessary.

I think the check node.hash.IsEqual(sm.nextCheckpoint.Hash) at line 1024 is redundant and I can remove the if-else and output and process the if condition only.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah you're right. The new change looks good.

Making the required changes in the processBlock and maybeAcceptBlock
function, such that they work with the headervalidation process and
accept already validated headers.
When the block data is being validated for an already validated
header, then we modify the behavioural flag, so that we don't repeat
the header validation steps again
HaveBlockWithData is a modification of HaveBlock function, but here we
return true only if we have block data also present along with the
block node.
Adding the functionality that the headers would be validated as soon as they are recieved from the peer
Making the required changes in the processBlock and maybeAcceptBlock
function, such that they work with the headervalidation process and
accept already validated headers.
@@ -905,13 +905,13 @@ func (sm *SyncManager) fetchHeaderBlocks() {
}

iv := wire.NewInvVect(wire.InvTypeBlock, node.hash)
haveInv, err := sm.haveInventory(iv)
haveBlock, err := sm.chain.HaveBlockWithData(node.hash)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you just add a TODO or NOTE and mention maybe integrating this into haveInventory so that we can still have a single function that can handle everything?

I like that this is minimal changes but some note above the code would be good so we know to get back to it later.

"disconnecting", blockHeader.BlockHash(), peer, err)
peer.Disconnect()
return
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah you're right. The new change looks good.

@@ -130,3 +131,75 @@ func (g *chaingenHarness) RejectHeader(blockName string, code blockchain.ErrorCo
"marked as known invalid", blockName, blockHash)
}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm not mistaken, the code here is taken from TestFullBlocks() is fullblocktests right?

Having duplicate code that performs the same functionality isn't great for maintainability. I need to take a better look at the original Generate() function but it does seem doable to integrate the headers tests there instead of making a whole another set of tests.

Ok seems like a lot of work from a glance. Still something desirable.

accepted()
// Adding a block with valid header and valid spend, but invalid parent
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment is wrong here. In headers-first you don't accept a header with an invalid parent. The tip is never reset to "b0" and thus it has a valid parent of "b4".

In general, checkpointed headers are assumed to be correct as it's hardcoded into the binary and it's actually a consensus rule (whether or not this is good is besides the point). Once you accept headers, the block must match with the header. If it doesn't I believe the node just hangs.

So we can have tests for rejecting headers but once we accept the headers, it's assumed as correct so the blockdata corresponding to the headers should never be rejected.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants