Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Aurora block hashchain #705

Closed
wants to merge 85 commits into from
Closed

Aurora block hashchain #705

wants to merge 85 commits into from

Conversation

Casuso
Copy link
Contributor

@Casuso Casuso commented Feb 24, 2023

Description

Aurora Block Hashchain

This implementation correlates to AIP 8.

Hashchain is turned off by default. We need to call start_hashchain on the contract to turned it on. More on this in the additional information below.

We are using a Merkle tree to compute the transactions hash of a block. The tree is actually never constructed; we dynamically maintain only the growing branch, storing hashes of the fully completed subtrees.

The hierarchy of usage is:
-contract (engine/src/lib.rs): Uses BlockchainHashchain to send txs and get the block hashchain.
-standalone: Uses BlockchainHashchain to send txs.
-BlockchainHashchain: Keeps track of block hashchain and other info, and uses BlockHashchainComputer to add txs and get hashes.
-BlockHashchainComputer: Keeps track of the StreamCompactMerkleTree and computes hashes.
-StreamCompactMerkleTree: Maintains the growing subtrees and compact subtrees applying Merkle tree rules.

Performance / NEAR gas cost considerations

To measure the impact on gas consumption of the block hashchain computation, we ran four tests against a base branch and the hashchain branch. After getting the raw gas consumption results from both, we found the delta (difference) between them and used some statistics to show the impact.

Tests 1 and 2 are focused on executing transactions on the same block, which is the most common case as only one transaction triggers the change in block height. For these the average increase observed (delta ave) is less than 0.46 Tgas, and the maximums are less than 0.64 Tgas.

Tests 3 and 4 are focused on executing a transaction on a block height change, that triggers the block hashchain computation, which is the most extensive one. Both show average and max increases of less than 0.62 Tgas.

An increase of 0.64 or 0.62 Tgas is a small amount so we should be fine. For reference, the transaction gas limit is 300 Tgas.

Testing

Several tests to cover the described structures.

Existing tests are sufficient to correlated the state between contract and standalone.

How should this be reviewed

Specifically, I would like to get feedback on the contract entry points methods on engine/src/lib.rs.

I want to make sure that we are reading correctly the input and output in all the methods that include the hashchain update. This is critical since a small difference on the input or output would imply a different hashchain value for all the blockchain.

Also, I want to make sure that we are covering all and only all the methods in the contract that should include the hashchain. The list of all the contract methods with their inclusion or exclusion of hashchain will be added as a comment on this PR so you can review it.

Additional information

The discussed mechanism to start the hashchain was:

  1. We should merge this PR with the hashchain off.
  2. A hashchain seed value should be separately precalculated (and continually updated) starting from Aurora Genesis block.
  3. Aurora DAO should pause the contract. There is a separate PR for that. The hashchain seed value should include that last tx.
  4. Aurora Labs should call start_hashchain passing the hashchain seed with its corresponding block height. This method adds its own data to the hashchain since it changes the state.
    It also resumes the contract, we don't need another call to resume.
  5. We can remove the methods pause, resume or start hashchain if decided.

TODO:

  1. Add an account requirement (from Aurora Labs) in the start_hashchain method.
  2. Merge the pause contract PR and add hashchain update to pause and resume methods since they change the state.

engine/src/lib.rs Outdated Show resolved Hide resolved
engine/src/lib.rs Outdated Show resolved Hide resolved
engine/src/lib.rs Outdated Show resolved Hide resolved
engine/src/lib.rs Outdated Show resolved Hide resolved
engine/src/lib.rs Outdated Show resolved Hide resolved
engine/src/lib.rs Outdated Show resolved Hide resolved
engine/src/lib.rs Outdated Show resolved Hide resolved
engine/src/lib.rs Outdated Show resolved Hide resolved
engine/src/lib.rs Outdated Show resolved Hide resolved
engine/src/hashchain.rs Outdated Show resolved Hide resolved
engine-standalone-storage/src/sync/mod.rs Outdated Show resolved Hide resolved
engine/Cargo.toml Outdated Show resolved Hide resolved
engine/Cargo.toml Outdated Show resolved Hide resolved
engine/src/bloom.rs Show resolved Hide resolved
engine/src/hashchain.rs Show resolved Hide resolved
engine/src/hashchain.rs Outdated Show resolved Hide resolved
@joshuajbouw joshuajbouw added the S-do-not-merge Status: Do not merge label Jul 18, 2023
@joshuajbouw
Copy link
Contributor

Adding do not merge until it is ready for release.

Copy link
Member

@birchmd birchmd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Requesting changes because there are a few outputs that are not correctly captured by the hashchain. Please also resolve conflicts to be up to date with the latest develop branch.

engine-standalone-storage/src/sync/mod.rs Outdated Show resolved Hide resolved
engine/src/hashchain.rs Show resolved Hide resolved
engine/src/hashchain.rs Outdated Show resolved Hide resolved
engine/src/hashchain.rs Show resolved Hide resolved
engine/src/lib.rs Show resolved Hide resolved
@@ -547,33 +688,43 @@ mod contract {
&mut Runtime,
);
}

update_hashchain(&mut io, function_name!(), &input, &[], &Bloom::default());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to be careful about the outputs. The output is not empty in the case of this function. You can see the io object is used inside receive_erc20_tokens. I think I mentioned this before, but I still wonder if it makes sense to try and connect the hashchain with the IO implementation of NearRuntime so that we never miss outputs.

I think the issue we ran into before is how to pass in the additional information (input, function name, bloom) without breaking the IO API. I'm wondering if there is a way to do a sort of builder pattern so that each call to the NearRuntime object updates the piece of the hashchain it knows about, but maybe that will be awkward and not work out well either.

In any case, if we are manually passing outputs then we will need to be very careful that all outputs are properly captured.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per our conversation, capturing the output here and in others would require an investigation of how they look on the standalone side (get_output_and_log_bloom method).

Copy link
Contributor Author

@Casuso Casuso Jul 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea of a builder pattern should be the best for a strong long-term solution. I think it would be better if the pattern gets implemented at the IO trait level or something slightly below it, so it has a wide impact both on the contract and on the standalone.

@@ -946,15 +1161,18 @@ mod contract {
// that they over paid for their deposit.
unsafe { io.promise_create_batch(&promise) };
}
update_hashchain(&mut io, function_name!(), &input, &[], &Bloom::default());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function also has a non-empty output. And similarly for the other storage_* functions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per our conversation, capturing the output here and in others would require an investigation of how they look on the standalone side (get_output_and_log_bloom method).

let mut state = state::get_state(&io).sdk_unwrap();

// *** TODO requires some Aurora Labs account
// require_account(some_AuroraLabs_account);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a require_account statement here for an account owned by Aurora Labs. For security reasons, it's better that I don't create the account.

Copy link
Member

@birchmd birchmd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hashchain implementation still needs a little more iteration as discussed in the comments above regarding how outputs are captured. However, maintaining a feature branch like this with significant code changes requires a lot of work. Given that this PR will have no effect on future Engine deployments until the hashchain feature is enabled in the state by the DAO pausing the contract and us calling start_hashchain, I propose that we merge this PR and iterate on the implementation in smaller follow-up PRs.

To reiterate, my argument is that is safe to merge this PR (all functionality is runtime-gated in a way that only the DAO can enable) and that not merging this PR creates greater maintenance overhead for finishing the work. Therefore merging it now seems better to me than continuing development on the feature branch.

Let me know what you think @joshuajbouw @aleksuss @mandreyel

github-merge-queue bot pushed a commit that referenced this pull request Aug 3, 2023
#810)

## Description

This is the first in a series of PRs that is meant to split up #705 .
The idea is to merge the changes which are made in that PR in logical
chunks until eventually the whole hashchain implementation is in. Doing
the work in smaller pieces will both make it easier to review and
prevent us from needing to maintain large, long-lived feature branches.

This first PR pulls in the transaction transaction parsing logic from
[borealis-engine-lib](https://github.com/aurora-is-near/borealis-engine-lib)
into this repo (in a future PR we will remove the duplicated code from
borealis-engine-lib). The logic is used here to simplify how
transactions are handled in tests because all transactions can
automatically be passed to both the Near runtime (processed by the
Engine as Wasm) and the standalone engine. In particular we remove the
large `if` statement that was starting to get unwieldy.

This work is important both because it makes future tests easier to
write and because it synchronizes the standalone and wasm engine
instances in our tests (this latter point is a prerequisite for properly
testing the hashchain).

Some notes about the PR:

1. I renamed the constant `ORIGIN` to `DEFAULT_AURORA_ACCOUNT_ID`
because I felt the latter is a more descriptive name for what the
constant represents.
2. The standalone engine is now present in `AuroraRunner` by default to
make out testing more robust (for example tests will now automatically
fail if a new state-mutating method is added to the Engine's `lib.rs`
without also being added to the standalone implementation). This means
there are a few places were I need to explicitly remove the standalone
instance where `AuroraRunner` is used as something other than an Engine
instance (modexp and xcc tests).
3. Some tests use view calls to inspect the Engine state and make
assertions. These calls are not present in the standalone because it
only cares about transactions that mutate the state. To make view calls
in `AuroraRunner` it is now required to call `.one_shot()` before the
calls. This essentially tells the testing framework that you are making
a view call so no state modifications will be made and therefore we can
ignore the standalone.
aleksuss pushed a commit that referenced this pull request Aug 10, 2023
#810)

## Description

This is the first in a series of PRs that is meant to split up #705 .
The idea is to merge the changes which are made in that PR in logical
chunks until eventually the whole hashchain implementation is in. Doing
the work in smaller pieces will both make it easier to review and
prevent us from needing to maintain large, long-lived feature branches.

This first PR pulls in the transaction transaction parsing logic from
[borealis-engine-lib](https://github.com/aurora-is-near/borealis-engine-lib)
into this repo (in a future PR we will remove the duplicated code from
borealis-engine-lib). The logic is used here to simplify how
transactions are handled in tests because all transactions can
automatically be passed to both the Near runtime (processed by the
Engine as Wasm) and the standalone engine. In particular we remove the
large `if` statement that was starting to get unwieldy.

This work is important both because it makes future tests easier to
write and because it synchronizes the standalone and wasm engine
instances in our tests (this latter point is a prerequisite for properly
testing the hashchain).

Some notes about the PR:

1. I renamed the constant `ORIGIN` to `DEFAULT_AURORA_ACCOUNT_ID`
because I felt the latter is a more descriptive name for what the
constant represents.
2. The standalone engine is now present in `AuroraRunner` by default to
make out testing more robust (for example tests will now automatically
fail if a new state-mutating method is added to the Engine's `lib.rs`
without also being added to the standalone implementation). This means
there are a few places were I need to explicitly remove the standalone
instance where `AuroraRunner` is used as something other than an Engine
instance (modexp and xcc tests).
3. Some tests use view calls to inspect the Engine state and make
assertions. These calls are not present in the standalone because it
only cares about transactions that mutate the state. To make view calls
in `AuroraRunner` it is now required to call `.one_shot()` before the
calls. This essentially tells the testing framework that you are making
a view call so no state modifications will be made and therefore we can
ignore the standalone.
github-merge-queue bot pushed a commit that referenced this pull request Aug 14, 2023
## Description

This PR continues the effort of merging #705 in multiple pieces.

This PR introduces the core hashchain logic as a library crate. The
reasons to factor the code in this way are:

1. Allows this PR to be safely merged because it has no impact on
existing engine code
2. The same core logic will be reusable between components that will
perform the hashchain computation in the future, namely: Aurora Engine
smart contract, standalone engine, Borealis Refiner.

In the next PR I'll pull in the changes from #705 which actually
introduce the hashchain calculation into the Aurora Engine smart
contract and standalone engine.

## Performance / NEAR gas cost considerations

N/A

## Testing

New hashchain tests

---------

Co-authored-by: Oleksandr Anyshchenko <oleksandr.anyshchenko@aurora.dev>
birchmd added a commit that referenced this pull request Aug 23, 2023
## Description

This PR continues the effort of merging #705 in multiple pieces.

This PR introduces the core hashchain logic as a library crate. The
reasons to factor the code in this way are:

1. Allows this PR to be safely merged because it has no impact on
existing engine code
2. The same core logic will be reusable between components that will
perform the hashchain computation in the future, namely: Aurora Engine
smart contract, standalone engine, Borealis Refiner.

In the next PR I'll pull in the changes from #705 which actually
introduce the hashchain calculation into the Aurora Engine smart
contract and standalone engine.

## Performance / NEAR gas cost considerations

N/A

## Testing

New hashchain tests

---------

Co-authored-by: Oleksandr Anyshchenko <oleksandr.anyshchenko@aurora.dev>
aleksuss added a commit that referenced this pull request Aug 30, 2023
## Description

This PR continues the effort of merging #705 in multiple pieces.

This PR introduces the core hashchain logic as a library crate. The
reasons to factor the code in this way are:

1. Allows this PR to be safely merged because it has no impact on
existing engine code
2. The same core logic will be reusable between components that will
perform the hashchain computation in the future, namely: Aurora Engine
smart contract, standalone engine, Borealis Refiner.

In the next PR I'll pull in the changes from #705 which actually
introduce the hashchain calculation into the Aurora Engine smart
contract and standalone engine.

## Performance / NEAR gas cost considerations

N/A

## Testing

New hashchain tests

---------

Co-authored-by: Oleksandr Anyshchenko <oleksandr.anyshchenko@aurora.dev>
@birchmd
Copy link
Member

birchmd commented Sep 1, 2023

Closing in favour of #831

@birchmd birchmd closed this Sep 1, 2023
@aleksuss aleksuss deleted the aurora_block_hashchain branch October 16, 2023 10:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-do-not-merge Status: Do not merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants