Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize interpreter::blockchain::{load_contract_code, code_copy} to read contract starting from offset #847

Merged
merged 20 commits into from
Dec 3, 2024

Conversation

acerone85
Copy link
Contributor

@acerone85 acerone85 commented Oct 4, 2024

[Link to related issue(s) here, if any]

Closes #681

[Short description of the changes.]

When loading the contract code into the stack, we use the new semantics of StorageRead from #863 to only fetch the portion of a contract starting from the specified offset. We also read the contract bytes directly into the stack.

A similar optimization is done for code_copy.

Checklist

  • Breaking changes are clearly marked as such in the PR description and changelog
  • New behavior is reflected in tests
  • If performance characteristic of an instruction change, update gas costs as well or make a follow-up PR for that
  • The specification matches the implemented behavior (link update PR if changes are needed)

Before requesting review

  • I have reviewed the code myself
  • I have created follow-up issues caused by this PR and linked them here

After merging, notify other teams

[Add or remove entries as needed]

@acerone85 acerone85 marked this pull request as draft October 4, 2024 16:06
@acerone85 acerone85 force-pushed the 681_remove_storage_read branch from 318b0b3 to 9f4117d Compare October 4, 2024 16:20
@acerone85 acerone85 force-pushed the 681_remove_storage_read branch from 9f4117d to ab0cc7b Compare October 4, 2024 16:30
@acerone85 acerone85 self-assigned this Oct 14, 2024
@acerone85 acerone85 marked this pull request as ready for review October 14, 2024 12:47
@acerone85 acerone85 requested a review from Voxelot as a code owner October 14, 2024 12:47
Copy link
Contributor

@netrome netrome left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks alright to me. The TODO comment needs to be resolved before I can approve. There's also some code duplication that I think we could factor out, but that's not blocking from my end.

fuel-vm/src/interpreter/blockchain.rs Outdated Show resolved Hide resolved
fuel-vm/src/interpreter/blockchain.rs Outdated Show resolved Hide resolved
netrome
netrome previously approved these changes Oct 22, 2024
Copy link
Contributor

@netrome netrome left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice stuff!

@netrome
Copy link
Contributor

netrome commented Oct 22, 2024

Poke me when you've fixed the clippy errors and I'll re-approve

netrome
netrome previously approved these changes Oct 22, 2024
Copy link
Member

@MitchTurner MitchTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a VM guy, so maybe I could get you to walk me through these changes.

CHANGELOG.md Outdated
@@ -7,6 +7,9 @@ and this project adheres to [Semantic Versioning](http://semver.org/).

## [Unreleased]

### Changed
- [#847](https://github.com/FuelLabs/fuel-vm/pull/847): Remove `contract_read`, and change `load_contract_code`, `code_copy` and `code_root` to explicitly load the contract code in a buffer. Also check for mismatches between contract size stored and actual size of contract in those functions.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused by this. There is no function called contract_read that is removed in these changes. And you added a call to read_contract, which sounds like the same thing?

.ok_or(PanicReason::ContractNotFound)?
.map_err(RuntimeError::Storage)?;

if contract_buffer.len() != contract_len {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's already allocated to to contracct_len size, will the len call ever not be contract_len?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is a leftover from refactoring, most probably this check should not be there.
(Btw, I noticed that we have a slightly different implementation of the same function:

fn read_contract<S>(
, so maybe we should factor out the common code and have only one function instead)

let contract = super::contract::contract(self.storage, &contract_id)?;
let contract_bytes = contract.as_ref().as_ref();

let contract_buffer: Vec<u8> = load_contract_code_from_storage(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After reviewing this change, I see that we basically have the same code as before and allocate vectors in all cases.

So basically, this change doesn't make a lot of sense=D

The main problem is why we can't do the optimal implementation and copy directly into the memory - contract offset(and blob offset, since we have the same problem for blobs LDC with mode == 1 and BLDD opcode).

I think we need to extend the StorageRead trait if we want to fix this issue in a proper way. A new method will do the same as the read or change the read method itself to something like:

fn read(&self, key: &Type::Key, offset: usize, buf: &mut [u8])
        -> Result<Option<usize>, Self::Error>

Could you change this method to work with offset and update all code to not allocate vector, please?

Also, it would be nice if you do the same for blobs as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the original idea was to have only one allocation. The only actual improvement in this PR as is now is fixing <MemoryStorage as StorageRead>::read`, which was panicking before due to slices of been different size being copied.

Happy to extend the trait to read. I think it would be better to add a new function read_from_offset, and then change read to be a provided method with offset 0

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with the new method and with reworking the old one. I think changing the old one makes more sense because the compiler will help you identify places that also need to be updated in the code to avoid allocation of the Vec. Plus this read function only exists for load opcodes, so they mainly define their existence, usage and shape.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, just to make sure: if we opt for modifying the StorageRead::read function, then there will be several places where the implementation needs to be updated, both in fuel-vm and fuel-core; also it is not clear to me that we can read from an arbitrary offset easily in all the implementations of the StorageRead trait. But I will try and see how far can I go before getting into trouble

@acerone85
Copy link
Contributor Author

acerone85 commented Oct 23, 2024

I'm not a VM guy, so maybe I could get you to walk me through these changes.

Happy to do it once I address the comments left by @xgreenx :)

let contract_bytes = contract.as_ref().as_ref();
let contract_len = contract_bytes.len();
let contract_bytes = self.memory.write(self.owner, dst_addr, length)?;
let contract_len = contract_size(&self.storage, &contract_id)?;
let charge_len = core::cmp::max(contract_len as u64, length);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We never read more than length bytes with the new version. Should charge_len be changed accordingly to reflect this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be a breaking change, and should be done separately.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also prefer to keep the current charging strategy, even if we overcharge user=) The length in most cases should be not more than contract_len.

@acerone85
Copy link
Contributor Author

With the new version the function code_root still uses the old StorageInspect::get<RawContractCode> function. This seems to be okay to me, as the function will return a borrowed reference to the whole contract, which will then be moved into chunks into a MerkleTree structure to compute the code root. In this case, using StorageRead::read<RawContractCode> would require allocating more space to store the contract code, which can be avoided. Do you agree with this?

@acerone85 acerone85 changed the title Remove storage read Optimize interpreter::blockchain::{load_contract_code, code_copy to read contract starting from offset Nov 13, 2024
// to invoke `self.storage.read_contract`. Furthermore,
// the call to `self.storage.read_contract` cannot fail with
// error `StorageError::`
#[allow(clippy::cast_possible_truncation)]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is better to remove it and contract_len.into() above, and convert contract_offset into the u32 at the begin of the function explicitly returning an error. In this case you don't need to have a comment

// error `StorageError::`
#[allow(clippy::cast_possible_truncation)]
self.storage
.read_contract(&contract_id, contract_offset as usize, contract_buffer)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the fuel-vm should be responsible for extracting subsection from contract_buffer and passing it to the read_contract function.

Plus, it should be responsible for filling contract_buffer[(length - (contract_len - contract_offset))..].fill(0); as well.

Because it is requirement from the specification and we don't want to leak business logic to the storage.

The same is related to another place in the PR.

Copy link
Collaborator

@xgreenx xgreenx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Dentosal Could you also review this PR please?=)

@Dentosal Dentosal changed the title Optimize interpreter::blockchain::{load_contract_code, code_copy to read contract starting from offset Optimize interpreter::blockchain::{load_contract_code, code_copy} to read contract starting from offset Dec 3, 2024
@xgreenx xgreenx added this pull request to the merge queue Dec 3, 2024
Merged via the queue into master with commit 917f3fa Dec 3, 2024
40 checks passed
@xgreenx xgreenx deleted the 681_remove_storage_read branch December 3, 2024 02:18
@xgreenx xgreenx mentioned this pull request Dec 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use methods from StorageRead trait in the LDC and copy code opcodes
5 participants