-
Notifications
You must be signed in to change notification settings - Fork 292
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
blockchain: Decouple processing and download logic. #2518
Conversation
Oh, one other thing I forgot to call out is that the In order to increase confidence in the notifications in general, I also added some logging for them as they're sent and compared the before and after of the full block tests as follows:
As can be seen the same number of and type of notifications are being produced. Closer inspection at the logs shows they are produced in the same order as expected as well. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment typos.
4e616cc
to
d250928
Compare
d250928
to
6e16986
Compare
6e16986
to
248bdcd
Compare
248bdcd
to
d64afaf
Compare
d64afaf
to
e041543
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got through the tests so far and called out a few typos. The process tests read very well and are super helpful for visualizing all of the various scenarios this takes into account!
ae6b99f
to
434823a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome.
On testnet there is a lot of output. One warning a few times [WRN] FEES: Trying to process mined transactions at block 587476 when previous best block was at height 587476
at different heights. Also this info [INF] CHAN: FORK: Block 0000002b64225f709a777cbb2acdb011705d8f43bbf32888eeed8476c5c4ecf1 (height 587489) forks the chain at height 587488/block 0000009a0b2f93131a56d24b5151337ebcdd259515b0ef9cf204e44469d01636, but does not cause a reorganize
numerous times with different values for the same block. I don't see either of those logs on mainnet, so wondering why such a difference.
All expected?
I also observed a successful reorg happen on testnet.
I am unable to use wallet with the new rpc api version yet, so unable to test sending transactions and such easily.
Yes, testnet has a lot of competing blocks show up around the time the difficulty drops. Mainnet does not have a difficulty reduction like that, so it doesn't happen there. The fees warning is also normal. It happens for all forked blocks. It probably doesn't actually need to be a warning, but the behavior is the same on master. |
434823a
to
7f73122
Compare
7f73122
to
c0aaa33
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Finished the first pass review. Aside from the minor issues, looks good. Will do another general pass after the duplicate positional checks issue is addressed. I also ran the sync a bunch of times in mainnet.
Just wanna say, this really mustn't have been fun to write, since it involves such a critical piece of the processing code for the blockchain. Definitely was hard to review, since the individual checks are all interdependent in the critical commit.
OTOH, the resulting decoupled processing is a very nice thing to achieve :P
// variety of fairly complex scenarios selects the expected best chain and | ||
// properly tracks the header with the most cumulative work that is not known to | ||
// be invalid as well as the one that is known to be invalid (when it exists). | ||
func TestProcessLogic(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, just wanna hightlight this is an awesome test :P
600154d
to
bf07a55
Compare
This has been resolved. I refactored the checks which apply to the block data to a separate function and call it instead when accepting the block data since the headers have already been checked at that point. |
bf07a55
to
a327a43
Compare
a327a43
to
a78f426
Compare
Did a second pass review of the code (as of now), taking note of the final structure for
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also ran the sync again a few times and executed the test suite on every commit.
Woop!
a78f426
to
b6b2be8
Compare
I updated this to ensure the index lock is not held during flushing in case pruned ticket information has to be reloaded which itself requires the index lock and to add a couple of trace logging statements for connected and disconnected blocks. Also, rebased. |
This makes the logging in the test harness more consistent and improves it to make use of the ability to look up the test name for blocks when logging issues for easier identification.
This modifies the chain test harness to allow the initial blocks that are generated to specified heights, such as stake validation height, to be independently generated without also accepting them to the chain. It also allows a custom underlying chain generator which houses those blocks to be provided when creating a test harness. These two combined provide more flexibility to the tests since they can choose when (and if) to make the generated blocks available to the chain.
b6b2be8
to
76b3e5a
Compare
This PR looks solid overall. Thank you for taking the time to write this along with clear comments and tests. I am still working through it, but have not seen any issues with the latest code thusfar. — On a mostly unrelated note with respect to this PR, while syncing I did notice a log message that appears incorrect. It cropped up while testing this PR so I figured it worth mentioning at least even though it's not coming from this set of changes.
In server.go -> OnNotFound, the following line reason := fmt.Sprintf("%d %v not found", numBlocks, txStr) seems like it should be reason := fmt.Sprintf("%d %v not found", numTxns, txStr) |
This completely reworks the way block index and processing works to use headers-first semantics and support out of order processing of the associated block data. This will ultimately provide a wide variety of benefits as it means the entire shape of the block tree can be determined from the headers alone which in turn will allow much more informed decisions to be made, provides better information regarding sync status and warning conditions, and allows future work to add invalidation and reconsideration of arbitrary blocks. It must be noted that this is a major change to the way the block index and block handling is done and is deeply integrated with the consensus rules, so it will need a significant amount of testing and extremely careful review. Also of note is that there are a lot of assumptions made in the calling code in regards to expected error attribution and the state the chain has at the moment notifications are processed, so all of those semantics have intentionally been retained, even though they might seem a bit out of place now, in order to limit the amount of changes introduced at a time and thus better ensures correctness. For example, in the future it would probably make more sense to make the notifications entirely asynchronous and for validation failures to be reported via those asynchronous notifications. However, before that can happen, all callers would first need to be updated to ensure they do not rely on the chain being in the same state it was at the moment the notification was generated. In addition to all of the existing full block tests, this introduces a comprehensive set of processing order tests which exercise all of the new logic. High level overview of the changes: - Introduce tracking for whether a block is fully linked, meaning it builds on a branch that has block data for all of its ancestors - Add received order tracking to ensure miners are not able to gain an advantage in terms of chain selection by only advertising a header - Add several new pieces of information to the block index: - Header with the most cumulative work that is not known to be invalid - Header with the most cumulative work that is known to be invalid - Map of best chain candidates to aid in efficient selection - Map of unlinked children for more efficient linking - Prunable cached chain tips to significantly reduce potential search space during invalidation - Introduce a compare function which determines which of two nodes should be considered better for the purposes of best chain selection - Add chain tip iteration capabilities with potential filtering - Mark all descendants invalid due to known invalid ancestor when a block is marked invalid - Add ability to determine if a block can currently be validated - Rework the chain reorganization func to work with an arbitrary target - Modify the overall chain reorg func to reorg to the block with the most cumulative work that is valid in the case it is not possible to reorg to the target block - Add a new MultiError type for keeping track of multiple errors in the same call - Add a new ErrNoBlockData error kind - Update all cases that only require all of the ancestor block data to be available to check for that condition instead of the more strict condition of already being fully validated - Introduce a new processing lock separate from the overall chain lock - Add new method and ability to independently accept block headers - Rework the block processing func to accept blocks out of order so long as their header is already known valid - Keep a cache of blocks that recently passed contextual validation checks - Cleanup and correct various comments to match reality - Add comprehensive tests to exercise the new processing logic and best header tracking (for both invalid and not known invalid)
This reworks the logic that is used to determine if the chain believes it is current to be more robust and to take advantage of the fact that the best known header is now available. In particular, it introduces two new constraints: - A latching mechanism that prevents the chain from flipping back to being non current unless no new blocks have been seen for an extended period of time - The chain only latches to current once it is synced to the header with the most cumulative work that is not known to be invalid in addition to the existing conditions
This modifies the getblockchaininfo RPC handler to use the newly-exposed best header that is not known to be invalid from chain versus the height of the current best chain tip.
This modifies the README.md and doc.go files to match the latest code.
76b3e5a
to
12e132e
Compare
Thanks to everyone for reviewing this. I know it was not a trivial one to review and took a lot of effort. |
This completely reworks the way block index and processing works to use headers-first semantics and support out of order processing of the associated block data.
This will ultimately provide a wide variety of benefits as it means the entire shape of the block tree can be determined from the headers alone which in turn will allow much more informed decisions to be made, provides better information regarding sync status and warning conditions, and allows future work to add invalidation and reconsideration of arbitrary blocks.
It must be noted that this is a major change to the way the block index and block handling is done and is deeply integrated with the consensus rules, so it will need a significant amount of testing and extremely careful review. To that end, the changes have been split up into several commits to help ease review, but, unfortunately the vast majority of them really couldn't be reasonably split, so the bulk of hardest to review changes are in a single commit.
Also of note is that there are a lot of assumptions made in the calling code in regards to expected error attribution and the state the chain has at the moment notifications are processed, so all of those semantics have intentionally been retained, even though they might seem a bit out of place now, in order to limit the amount of changes introduced at a time and thus better ensures correctness.
For example, in the future it would probably make more sense to make the notifications entirely asynchronous and for validation failures to be reported via those asynchronous notifications. However, before that can happen, all callers would first need to be updated to ensure they do not rely on the chain being in the same state it was at the moment the notification was generated.
Another significant change is that the logic that is used to determine if the chain believes it is current is now more robust and takes advantage of the fact the best known header is now available.
In particular, it introduces two new constraints:
Finally, in addition to all of the existing full block tests, this introduces a comprehensive set of processing order tests which exercise all of the new logic.
High level overview of the changes:
MultiError
type for keeping track of multiple errors in the same callErrNoBlockData
error kindREADME.md
anddoc.go
to accommodate the changesThis is work towards #1145.