Implementing weights in Madara #968
Replies: 2 comments 1 reply
This sounds like a gas mispricing issue, like the one that caused the Shanghai attacks in 2016. The main purpose of Weights in Substrate is to assign costs to coarse blocks of code and thereby gain efficiency over metering after every addition, multiplication, storage read, memory access, etc. Substrate provides benchmarking infrastructure for those code blocks so that they can be metered correctly, but I don't believe this is a hard requirement. Since, based on the description and link you've provided, it seems that Cairo already limits gas or steps within a contract (please correct me if I'm wrong), it might be reasonable to simply treat the Weight limit as a gas limit (plus some extra for non-Cairo logic, such as pallet initialization and finalization). However, highly varying time costs per gas are a major issue, which likely requires setting the overall gas limit conservatively or repricing certain instructions; otherwise the chain is vulnerable to gas mispricing attacks.
Update
Some updates on this issue after internal discussion and extensive testing using gomu gomu. Quoting @apoorvsadana
And @tdelabro
In conclusion, Madara's ordering policy needs to be clearly defined and configurable (GPA, FCFS or more complex ones).
Related discussion: https://substrate.stackexchange.com/questions/10517/how-are-the-node-threads-and-tasks-managed
Hey everyone, I am opening this thread as per my discussion with @EvolveArt and @tdelabro regarding the implementation of weights in Madara. I have tried to add all the details and links needed to understand the complications involved in this process. The main purpose of this discussion is to evaluate different possible methods to implement weights (or any other solution) that solve the attack vectors mentioned and, at the same time, keep block production efficient and within limits.
About weights
The main purpose of weights in substrate is to ensure that the chain can create blocks within the targeted execution time. For a block time of 6 seconds, it's recommended that we have an execution time of around 2 seconds. By using the right weights, it becomes possible for the node to recognise which transactions will cause the execution time to cross 2 seconds and the node can choose to skip these transactions and include them in the next block. For example, if the node has already been executing transactions for 1.5 seconds and a new transaction will take 1 second to run (we know this because of the weight), we will skip it and add it to the next block.
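The skipping behaviour described above can be sketched as a small greedy loop. This is a simplified illustration, not Substrate's actual block builder: `Tx`, `build_block`, and the sample weights are hypothetical names and values chosen to match the 1.5 second / 1 second example.

```python
from dataclasses import dataclass

# All names and numbers here are hypothetical, for illustration only.
PICOS_PER_SECOND = 10**12
BLOCK_WEIGHT_BUDGET = 2 * PICOS_PER_SECOND  # 2 s of execution, in picoseconds


@dataclass
class Tx:
    name: str
    weight: int  # estimated execution time in picoseconds


def build_block(pool, budget=BLOCK_WEIGHT_BUDGET):
    """Include each transaction whose weight still fits in the remaining budget.

    Returns (included, deferred); deferred transactions wait for a later block.
    """
    included, deferred, used = [], [], 0
    for tx in pool:
        if used + tx.weight <= budget:
            included.append(tx)
            used += tx.weight
        else:
            deferred.append(tx)
    return included, deferred


pool = [
    Tx("a", 1_500_000_000_000),  # 1.5 s
    Tx("b", 1_000_000_000_000),  # 1.0 s, would push the block to 2.5 s
    Tx("c", 400_000_000_000),    # 0.4 s, still fits after "a"
]
included, deferred = build_block(pool)
print([tx.name for tx in included], [tx.name for tx in deferred])
```

The key point is that the decision uses only the declared weight, so no transaction has to be executed to find out that it does not fit.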
Assigning weights to a transaction
The weight of a transaction is the time it takes to execute the transaction in picoseconds. Now this is tricky because the only way to know this number is by executing the transaction itself. But the purpose of weights is to tell if we should include a transaction within a block or not before executing it. So then we need to give substrate a way to guess the execution time.
On Frontier (the EVM compatibility layer for substrate), the gas of the transaction is used to calculate the weight. This makes sense as the gas on EVM chains is directly related to the computational resources needed to execute the transaction and hence the execution time. So once you know you want to use the gas to estimate the weight, the next question is how much weight do you assign per gas? Well, you assume the worst case possible, i.e. the block will consume the max gas limit of the block. Now, since you know the total weight of the block (2 seconds in terms of picoseconds) and the total gas consumed, you can find the weight per unit gas. We assume the worst case to be sure we don't exceed the 2 seconds execution time.
Now, on chains like Starknet, the math isn't very straightforward because gas on Starknet is a measure of how long it takes to prove the transaction and isn't directly related to the execution time. However, at the same time, the gas is the best variable we have in the transaction input which can be used to estimate the complexity of the transaction. Hence, we have to use this to estimate the weight (this does make benchmarking a little complicated as explained below).
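The weight-per-gas arithmetic for the EVM-style case can be written out as follows. This is a minimal sketch: the 10 million block gas limit and the function names are assumptions made for the example, not values taken from Frontier or Madara.

```python
PICOS_PER_SECOND = 10**12


def weight_per_gas(execution_seconds: int, block_gas_limit: int) -> int:
    """Spread the block's execution budget evenly over the maximum gas.

    Floor division keeps a completely full block at or below the budget.
    """
    budget = execution_seconds * PICOS_PER_SECOND
    return budget // block_gas_limit


def tx_weight(gas: int, per_gas: int) -> int:
    """Estimated execution time of a transaction, in picoseconds."""
    return gas * per_gas


# Hypothetical numbers: 2 s execution budget, 10 million gas per block.
per_gas = weight_per_gas(2, 10_000_000)
transfer_weight = tx_weight(21_000, per_gas)
print(per_gas, transfer_weight)
```

With these assumed numbers each unit of gas is charged 200,000 picoseconds, so a 21,000 gas transaction is estimated at 4.2 milliseconds.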
Benchmarking
Now Starknet has its own limits on the total steps in a block/transaction and total gas of a block. However, these limits were created keeping in mind the architecture of Starkware's sequencer. We need to re-calculate these limits for the architecture, and specifically the block time we choose in substrate.
As of writing this, we have an execution time of 2 seconds. This means our BLOCK_GAS_LIMIT is the maximum gas we can include in a block if we run transactions for 2 seconds. This, clearly, is very subjective and this number would vary largely on different machines. Hence, to arrive at the correct number, we define BLOCK_GAS_LIMIT as
While this still isn't a deterministic number and the exact BLOCK_GAS_LIMIT would differ slightly over executions, the difference would be small and negligible as long as it's run on the same specific machine.
Now, on how to practically calculate this number, we can use benchmarking tools provided by substrate. On EVM chains, you can simply benchmark against a few contracts and check the time it takes to execute 1 unit of gas. Since gas is directly related to execution time, we should get fairly consistent results and should be able to calculate our BLOCK_GAS_LIMIT for 2 seconds.
On Starknet, however, the time taken to execute 1 unit of gas (weight per gas) can vary drastically across contracts depending upon what the contract is doing. As mentioned above, this is mainly because gas is not a direct measure of execution time. Consequently, we need to come up with techniques to calculate the approximate weight per unit gas. Some possible ways to do this are
These are just 2 methods but there is scope for improvement here and we might be able to come up with better approaches that can be more efficient.
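As one illustration of a conservative estimate (not necessarily either of the methods referred to above), the weight per gas could be taken from the slowest benchmarked contract and the block gas limit derived from it. The sample measurements and function names below are made up for the example.

```python
PICOS_PER_SECOND = 10**12


def worst_case_weight_per_gas(samples):
    """samples: iterable of (gas_used, elapsed_picoseconds) benchmark results.

    Uses ceiling division per sample so the cost of gas is never underestimated,
    then keeps the most expensive (slowest) ratio.
    """
    return max(-(-elapsed // gas) for gas, elapsed in samples)


def block_gas_limit(execution_seconds: int, per_gas: int) -> int:
    """Largest gas total that fits in the execution budget at the given rate."""
    return (execution_seconds * PICOS_PER_SECOND) // per_gas


# Hypothetical benchmark results for three contracts on one reference machine.
samples = [
    (1_000, 50_000_000),   # 50,000 ps per gas
    (2_000, 300_000_000),  # 150,000 ps per gas
    (500, 200_000_000),    # 400,000 ps per gas (slowest)
]
per_gas = worst_case_weight_per_gas(samples)
limit = block_gas_limit(2, per_gas)
print(per_gas, limit)
```

Pricing every unit of gas at the slowest observed rate wastes capacity on cheap contracts, but it keeps a block of the most expensive contract within the 2 second budget on the machine that produced the samples.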
Current behaviour
Right now, we assign a 0 weight to all our transactions. Hence, there's no way for substrate to estimate how long a transaction will take before actually executing it. As a result, currently, our node keeps adding transactions to the block till we exceed the block execution time after which it stops. A few points to note here are
discarding proposal for slot 281845487; block production took too long. However, in the very next slot, the node includes the transaction (which was already executed) and successfully proposes a block. It's also important to note here that the node shouldn't be allowed to execute 40 million steps with a 2 second execution time, so the fact that we are able to do this is a bug itself. And even with the bug, the chain doesn't seem to halt as we are able to add the transaction in the next block.
Attack vector in the current setup
Our current setup doesn't have any limits on the total steps or gas allowed in a block. The only limit that all nodes agree on is a 6 seconds block time and 2 seconds time to execute transactions/import a block. However, time, as mentioned before, is a non-deterministic measure. What might take 2s to execute on one machine might take a minute on another. Hence, a possible attack vector in this case is
Steps from here
The following are the possible actions we can take from here
BLOCK_GAS_LIMIT for the chain (and make it configurable as app chains would want to set their own limits).
BLOCK_GAS_LIMIT