Optimization on Ethereum: Make a Difference with Function Names
- Gas cost optimization is crucial for smart contracts on Ethereum.
- The "function dispatcher" selects which function to execute when an EVM smart contract is called.
- The Solidity compiler generates the "function dispatcher" for publicly exposed functions, whereas in Yul it has to be coded by hand.
- Signatures, hashes, and selectors of functions are determined by their names and parameter types.
- The compiler's optimization setting and the number of functions affect the function selection algorithm.
- Strategically renaming functions changes their selector values, which influences both the selection order and the gas cost.
Gas cost optimization is a key challenge in the development of smart contracts on the Ethereum blockchain, as each operation on Ethereum incurs a gas cost. This article is a translation of Optimisation sur Ethereum : Faites la différence avec les noms de fonctions (🇫🇷).
Reminder:
- The bytecode represents a smart contract on the blockchain as a sequence of hexadecimal values.
- The Ethereum Virtual Machine (EVM) executes instructions by reading this bytecode during interactions with the contract.
- Each elementary instruction, encoded in one byte, is called an opcode and has a gas cost reflecting the resources required for its execution.
- A compiler translates the source code of a contract into bytecode executable by the EVM and provides elements such as the Application Binary Interface (ABI).
- An ABI defines how a contract's functions should be called and how data is exchanged, specifying the data types of arguments and the functions' signatures.
In this article, we will explore how simply naming your functions can influence the gas costs associated with your contract.
We will also discuss various optimization strategies, from the order of signature hashes to function renaming tricks, to reduce costs associated with interactions with your contracts.
Details:
This article is based on:
- Solidity code (0.8.13, 0.8.17, 0.8.20, 0.8.22)
- Compiled using the solc compiler
- For EVMs on Ethereum
The following concepts will be covered:
- The signature and the selector: the textual and numerical identifiers of a function within the EVM.
- The "function dispatcher": the mechanism for selecting a function within a contract.
- The function name as an argument (on the caller side).
The signature of a function, as used by Solidity contracts on the Ethereum Virtual Machine (EVM), is the concatenation of its name and its parameter types (excluding the return type and spaces).
The function selector is the identifier used to call the function. In Solidity, it consists of the first 4 bytes (the 32 most significant bits) of the Keccak-256 hash of the function's signature.
This is based on the Solidity ABI specifications.
Note that this describes the function selector as produced by the solc compiler for Solidity. This might not be the case for other languages like Rust, which operates on a completely different paradigm.
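As a minimal sketch of the selector computation described above, the following Python snippet hashes a signature with Keccak-256 and keeps the first four bytes. Python's standard `hashlib` only provides the NIST SHA-3 variant, whose padding differs from the Keccak-256 used by Ethereum, so a small pure-Python Keccak is included. The helper names (`keccak256`, `selector`) are chosen for this illustration and do not come from any library.

```python
"""Compute a Solidity function selector without third-party packages."""

MASK64 = (1 << 64) - 1


def _rol64(value: int, shift: int) -> int:
    """Rotate a 64-bit lane to the left."""
    shift %= 64
    return ((value << shift) | (value >> (64 - shift))) & MASK64


def _keccak_f1600(lanes: list) -> list:
    """Apply the 24 rounds of the Keccak-f[1600] permutation (lanes[x][y])."""
    lfsr = 1
    for _ in range(24):
        # Theta: mix each column parity into the neighbouring columns.
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ _rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # Rho and Pi: rotate lanes and move them to their new positions.
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], _rol64(current, (t + 1) * (t + 2) // 2)
        # Chi: non-linear step applied row by row.
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # Iota: add the round constant, generated by a small LFSR.
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    return lanes


def keccak256(data: bytes) -> bytes:
    """Keccak-256 as used by Ethereum (rate of 136 bytes, padding byte 0x01)."""
    rate = 136
    padded = bytearray(data) + bytes(rate - (len(data) % rate))
    padded[len(data)] ^= 0x01   # first padding byte
    padded[-1] ^= 0x80          # last padding byte (may be the same byte)
    state = bytearray(200)
    for offset in range(0, len(padded), rate):
        for i in range(rate):
            state[i] ^= padded[offset + i]
        lanes = [[int.from_bytes(state[8 * (x + 5 * y): 8 * (x + 5 * y) + 8], "little")
                  for y in range(5)] for x in range(5)]
        lanes = _keccak_f1600(lanes)
        for x in range(5):
            for y in range(5):
                state[8 * (x + 5 * y): 8 * (x + 5 * y) + 8] = lanes[x][y].to_bytes(8, "little")
    return bytes(state[:32])


def selector(signature: str) -> str:
    """Return the 4-byte selector of a canonical signature as 8 hex digits."""
    return keccak256(signature.encode("ascii"))[:4].hex()


if __name__ == "__main__":
    print(selector("transfer(address,uint256)"))  # a9059cbb (the ERC20 transfer selector)
    print(selector("square(uint32)"))
```

The signature must be in canonical form (no spaces, no parameter names, `uint256` rather than `uint`), otherwise the resulting selector will not match the one computed by the compiler.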
Taking parameter types into account is essential to differentiate functions with the same name but different parameters, as seen in the safeTransferFrom method of ERC721 tokens.
However, because only four bytes are retained for the function selector, hash collisions between two functions are possible: a rare but real risk despite the more than 4 billion possible values (2^32).
The Ethereum Signature Database provides the following example:
Function selectors | Signatures |
---|---|
0xcae9ca51 | onHintFinanceFlashloan(address,address,uint256,bool,bytes) |
0xcae9ca51 | approveAndCall(address,uint256,bytes) |
Fortunately, a simple Solidity contract with these two functions does not compile.
TypeError: Function signature hash collision for approveAndCall(address,uint256,bytes)
--> contracts/HashCollision.sol:10:1:
|
10 | contract HashCollision {
| ^ (Relevant source part starts here and spans across multiple lines).
However, selector collisions remain a real concern: see the Hint-finance challenge in Web3 Hacking: Paradigm CTF 2022.
The "function dispatcher" (or function manager) in smart contracts written for the EVM is the component of the contract that determines which function should be executed when someone interacts with the contract through its ABI.
In essence, the "function dispatcher" is like a conductor during calls to the functions of a smart contract. It ensures that the right functions are called when you perform specific actions on the contract.
When interacting with a smart contract through a transaction, you specify which function you want to execute. The "function dispatcher" thus links the command to the specific function that will be called.
The function's selector is retrieved from the calldata during contract execution, and a revert occurs if the call cannot be matched with a function of the contract.
The selection mechanism is similar to that of a switch/case structure or a set of if/else statements.
Applying what has been discussed above, we obtain, for the following function:
function square(uint32 num) public pure returns (uint32) {
    return num * num;
}
We obtain the following signature, hash, and selector:
Function | square(uint32 num) public pure returns (uint32) |
---|---|
Signature | square(uint32) (1) |
Hash | d27b38416d4826614087db58e4ea90ac7199f7f89cb752950d00e21eb615e049 |
Selector | d27b3841 |
(1): computed with an online Keccak-256 calculator on square(uint32)
In Solidity, the "function dispatcher" is generated by the compiler, so there's no need to handle the coding of this complex task.
It only applies to functions that are accessible from outside the contract, that is, functions declared external or public.
- External: External functions are designed to be called from outside the contract, typically by other contracts or external accounts. It is the visibility used to expose a public interface to your contract.
- Public: Public functions are accessible from both outside and inside the contract.
- Internal and Private: Internal and private functions can only be called from inside the contract (and from contracts inheriting from it in the case of internal).
Example #1:
pragma solidity 0.8.13;

contract MyContract {
    uint256 public value;
    uint256 internalValue;

    function setValue(uint256 _newValue) external {
        value = _newValue;
    }

    function getValue() public view returns (uint256) {
        return value;
    }

    function setInternalValue(uint256 _newValue) internal {
        internalValue = _newValue;
    }

    function getInternalValue() public view returns (uint256) {
        return internalValue;
    }
}
If we revisit the previous code used as an example, we obtain the following signatures and selectors:
Functions | Signatures | Keccak | Selectors |
---|---|---|---|
setValue(uint256 _newValue) external | setValue(uint256) | 55241077...ecbd | 55241077 |
getValue() public view returns (uint256) | getValue() | 20965255...ad96 | 20965255 |
setInternalValue(uint256 _newValue) internal | setInternalValue(uint256) | 6115694f...7ce1 | 6115694f |
getInternalValue() public view returns (uint256) | getInternalValue() | e778ddc1...c094 | e778ddc1 |
(The Keccak hashes have been intentionally truncated)
If we examine the ABI generated during compilation, the function setInternalValue() does not appear, which is expected as its visibility is internal (see above).
Note also that the ABI references the storage variable value, which is public (we will come back to this later).
Here is an excerpt of the "function dispatcher" code generated by the solc compiler (Solidity version: 0.8.13). The selector is retrieved from the calldata and then compared in turn to the selector of each exposed function, with a "jump" to the code of the matching function.
tag 1
JUMPDEST
POP
PUSH 4
CALLDATASIZE
LT
PUSH [tag] 2
JUMPI
PUSH 0
CALLDATALOAD
PUSH E0
SHR
DUP1
PUSH 20965255 // ◄ signature : getValue()
EQ
PUSH [tag] getValue_0
JUMPI
DUP1
PUSH 3FA4F245 // ◄ signature : value (automatic storage getters)
EQ
PUSH [tag] 4
JUMPI
DUP1
PUSH 55241077 // ◄ signature : setValue(uint256)
EQ
PUSH [tag] setValue_uint256_0
JUMPI
DUP1
PUSH E778DDC1 // ◄ signature : getInternalValue()
EQ
PUSH [tag] getInternalValue_0
JUMPI
tag 2
JUMPDEST
PUSH 0
DUP1
REVERT
Seen as a flow, the selection mechanism is similar to that of a switch/case structure or a set of if/else statements.
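The dispatcher listing above can be paraphrased in a few lines of Python. This is only an illustrative model of the control flow (length check, selector extraction, ascending comparisons, revert), not what the EVM literally executes; the `Revert` exception and the `dispatch` function are hypothetical names.

```python
class Revert(Exception):
    """Stands in for the EVM REVERT opcode."""


# Externally callable functions of Example #1, keyed by selector.
ENTRY_POINTS = {
    0x55241077: "setValue(uint256)",
    0x20965255: "getValue()",
    0xE778DDC1: "getInternalValue()",
    0x3FA4F245: "value()",  # automatic getter of the public variable
}


def dispatch(calldata: bytes) -> str:
    """Return the signature of the function selected by the calldata."""
    # PUSH 4, CALLDATASIZE, LT, JUMPI: fewer than 4 bytes means no selector.
    if len(calldata) < 4:
        raise Revert("calldata shorter than a selector")
    # CALLDATALOAD reads a zero-padded 32-byte word; SHR by 0xE0 (224 bits)
    # keeps its 4 most significant bytes.
    word = int.from_bytes(calldata[:32].ljust(32, b"\x00"), "big")
    selector = word >> 0xE0
    # DUP1, PUSH4, EQ, PUSH [tag], JUMPI: one test per function, ascending order.
    for candidate in sorted(ENTRY_POINTS):
        if selector == candidate:
            return ENTRY_POINTS[candidate]
    raise Revert("no matching function")


if __name__ == "__main__":
    print(dispatch(bytes.fromhex("20965255")))  # getValue()
```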
Important: The evaluation order of functions is not the same as their declaration order in the code!
Evaluation Order | Order in the code | Selectors | Signatures |
---|---|---|---|
1 | 3 | 20965255 | getValue() |
2 | 1 | 3FA4F245 | value (automatic getter) |
3 | 2 | 55241077 | setValue(uint256) |
4 | 4 | E778DDC1 | getInternalValue() |
Indeed, the selector tests are ordered by ascending selector value:
20965255 < 3FA4F245 < 55241077 < E778DDC1
The function with the selector 3FA4F245 is actually an automatic getter for the public variable value, generated by the compiler. In Solidity, the compiler automatically provides a public getter for any public storage variable.
uint256 public value;
We can find the selector (3FA4F245) and the function (at the tag 4 address) of the automatic getter for this variable in the opcodes.
Selector:
DUP1
PUSH 3FA4F245
EQ
PUSH [tag] 4
JUMPI
Function:
tag 4
JUMPDEST
PUSH [tag] 11
PUSH [tag] 12
JUMP [in]
tag 11
JUMPDEST
PUSH 40
MLOAD
PUSH [tag] 13
SWAP2
SWAP1
PUSH [tag] abi_encode_tuple_t_uint256__to_t_uint256__fromStack_reversed_0
JUMP [in]
tag 13
JUMPDEST
PUSH 40
MLOAD
DUP1
SWAP2
SUB
SWAP1
RETURN
This getter actually has the same code as the getValue() function:
tag getValue_0
JUMPDEST
PUSH [tag] getValue_1
PUSH [tag] getValue_3
JUMP [in]
tag getValue_1
JUMPDEST
PUSH 40
MLOAD
PUSH [tag] getValue_2
SWAP2
SWAP1
PUSH [tag] abi_encode_tuple_t_uint256__to_t_uint256__fromStack_reversed_0
JUMP [in]
tag getValue_2
JUMPDEST
PUSH 40
MLOAD
DUP1
SWAP2
SUB
SWAP1
RETURN
This shows that declaring the variable value as public is redundant with the getValue() function. It also highlights a weakness of the solc compiler, which does not merge the code of the two functions.
For those interested in delving deeper, here is a link to a detailed article on automatic storage getters in Solidity.
Here is an excerpt from an example of an ERC20 contract entirely written in Yul.
While Solidity provides abstraction and readability, Yul, a lower-level language close to assembly, allows for much finer control over execution.
object "runtime" {
code {
// Protection against sending Ether
require(iszero(callvalue()))
// Dispatcher
switch selector()
case 0x70a08231 /* "balanceOf(address)" */ {
returnUint(balanceOf(decodeAsAddress(0)))
}
case 0x18160ddd /* "totalSupply()" */ {
returnUint(totalSupply())
}
case 0xa9059cbb /* "transfer(address,uint256)" */ {
transfer(decodeAsAddress(0), decodeAsUint(1))
returnTrue()
}
case 0x23b872dd /* "transferFrom(address,address,uint256)" */ {
transferFrom(decodeAsAddress(0), decodeAsAddress(1), decodeAsUint(2))
returnTrue()
}
case 0x095ea7b3 /* "approve(address,uint256)" */ {
approve(decodeAsAddress(0), decodeAsUint(1))
returnTrue()
}
case 0xdd62ed3e /* "allowance(address,address)" */ {
returnUint(allowance(decodeAsAddress(0), decodeAsAddress(1)))
}
case 0x40c10f19 /* "mint(address,uint256)" */ {
mint(decodeAsAddress(0), decodeAsUint(1))
returnTrue()
}
default {
revert(0, 0)
}
/* ---------- calldata decoding functions ----------- */
function selector() -> s {
s := div(calldataload(0), 0x100000000000000000000000000000000000000000000000000000000)
}
...
It features the same cascading if/else structure as the dispatcher generated by solc.
Creating a contract entirely in Yul requires coding the "function dispatcher" manually, which lets you choose the order in which selectors are tested and use algorithms other than a simple cascade of tests.
Now, here's a completely different example to illustrate that things are actually more complex!
Depending on the number of functions and the optimization level (see: --optimize-runs), the Solidity compiler behaves differently!
Example #2:
// SPDX-License-Identifier: GPL-3.0
pragma solidity 0.8.17;

contract Storage {
    uint256 numberA;
    uint256 numberB;
    uint256 numberC;
    uint256 numberD;
    uint256 numberE;

    // selector : C534BE7A
    function storeA(uint256 num) public {
        numberA = num;
    }

    // selector : 9AE4B7D0
    function storeB(uint256 num) public {
        numberB = num;
    }

    // selector : 4CF56E0C
    function storeC(uint256 num) public {
        numberC = num;
    }

    // selector : B87C712B
    function storeD(uint256 num) public {
        numberD = num;
    }

    // selector : E45F4CF5
    function storeE(uint256 num) public {
        numberE = num;
    }

    // selector : 2E64CEC1
    function retrieve() public view returns (uint256) {
        return numberA * numberB * numberC * numberD;
    }
}
Here, the storage variables are internal (the default visibility in Solidity), so no automatic getter is added by the compiler.
The ABI JSON therefore lists exactly 6 functions, the 6 public functions with their respective signatures:
Functions | Signatures | Selectors |
---|---|---|
storeA(uint256 num) public | storeA(uint256) | C534BE7A |
storeB(uint256 num) public | storeB(uint256) | 9AE4B7D0 |
storeC(uint256 num) public | storeC(uint256) | 4CF56E0C |
storeD(uint256 num) public | storeD(uint256) | B87C712B |
storeE(uint256 num) public | storeE(uint256) | E45F4CF5 |
retrieve() public view returns (uint256) | retrieve() | 2E64CEC1 |
Depending on the optimization level of the compiler, the code generated for the "function dispatcher" differs.
With a level of 200 (--optimize-runs 200), we obtain the same kind of code as before, with its cascading if/else statements.
tag 1
JUMPDEST
POP
PUSH 4
CALLDATASIZE
LT
PUSH [tag] 2
JUMPI
PUSH 0
CALLDATALOAD
PUSH E0
SHR
DUP1
PUSH 2E64CEC1
EQ
PUSH [tag] retrieve_0
JUMPI
DUP1
PUSH 4CF56E0C
EQ
PUSH [tag] storeC_uint256_0
JUMPI
DUP1
PUSH 9AE4B7D0
EQ
PUSH [tag] storeB_uint256_0
JUMPI
DUP1
PUSH B87C712B
EQ
PUSH [tag] storeD_uint256_0
JUMPI
DUP1
PUSH C534BE7A
EQ
PUSH [tag] storeA_uint256_0
JUMPI
DUP1
PUSH E45F4CF5
EQ
PUSH [tag] storeE_uint256_0
JUMPI
tag 2
JUMPDEST
PUSH 0
DUP1
REVERT
However, with a higher level of runs (--optimize-runs 300), the generated code changes:
tag 1
JUMPDEST
POP
PUSH 4
CALLDATASIZE
LT
PUSH [tag] 2
JUMPI
PUSH 0
CALLDATALOAD
PUSH E0
SHR
DUP1
PUSH B87C712B
GT
PUSH [tag] 9
JUMPI
DUP1
PUSH B87C712B
EQ
PUSH [tag] storeD_uint256_0
JUMPI
DUP1
PUSH C534BE7A
EQ
PUSH [tag] storeA_uint256_0
JUMPI
DUP1
PUSH E45F4CF5
EQ
PUSH [tag] storeE_uint256_0
JUMPI
PUSH 0
DUP1
REVERT
tag 9
JUMPDEST
DUP1
PUSH 2E64CEC1
EQ
PUSH [tag] retrieve_0
JUMPI
DUP1
PUSH 4CF56E0C
EQ
PUSH [tag] storeC_uint256_0
JUMPI
DUP1
PUSH 9AE4B7D0
EQ
PUSH [tag] storeB_uint256_0
JUMPI
tag 2
JUMPDEST
PUSH 0
DUP1
REVERT
The opcodes and the execution flow with --optimize-runs 300 are no longer the same.
The tests are now "split" into two linear searches around a pivot value, B87C712B, which shortens the path to the functions with the highest selectors.
The least favorable cases are now storeB(uint256) and storeE(uint256), each reached after 4 tests, instead of 3 tests and 6 tests respectively with the previous algorithm.
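To make the comparison concrete, here is a small Python sketch that counts the comparisons needed to reach each function of Example #2, first with a plain linear search over the sorted selectors, then with a single split around the pivot. It only models the structure visible in the two opcode listings. The function names are hypothetical, and taking the middle selector as the pivot is an assumption that happens to match the listing (B87C712B).

```python
# Selectors of Example #2, as listed in the contract comments.
SELECTORS = {
    "storeA(uint256)": 0xC534BE7A,
    "storeB(uint256)": 0x9AE4B7D0,
    "storeC(uint256)": 0x4CF56E0C,
    "storeD(uint256)": 0xB87C712B,
    "storeE(uint256)": 0xE45F4CF5,
    "retrieve()": 0x2E64CEC1,
}


def linear_tests(target: int) -> int:
    """Number of EQ tests before `target` matches, in ascending selector order."""
    ordered = sorted(SELECTORS.values())
    return ordered.index(target) + 1


def split_tests(target: int) -> int:
    """One GT test against the pivot, then a linear search in the chosen half."""
    ordered = sorted(SELECTORS.values())
    pivot = ordered[len(ordered) // 2]  # assumed pivot: the middle selector
    # Keep only the selectors on the same side of the pivot as the target.
    half = [s for s in ordered if (s >= pivot) == (target >= pivot)]
    return 1 + half.index(target) + 1


if __name__ == "__main__":
    for name, sel in sorted(SELECTORS.items(), key=lambda item: item[1]):
        print(f"{name:16} linear={linear_tests(sel)} split={split_tests(sel)}")
```

With one split, every function is reached in 2 to 4 tests, whereas the linear search ranges from 1 to 6 tests.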
Determining what triggers this optimization is more delicate. With --optimize-runs 284, for example, the split is triggered from 6 functions onward and produces two linear series of 3 tests.
With four functions or fewer, the selection is done through a linear search. With five or more functions, the compiler may split the processing depending on its optimization parameter.
Tests on basic contracts with 4 to 15 functions, using optimization levels from 200 to 1000 runs, have demonstrated these thresholds.
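A minimal sketch of a heuristic that reproduces the thresholds quoted above. It assumes that the compiler splits when the expected execution savings, `runs * 6 * (n - 4)`, exceed the deployment cost of the extra bytecode, taken here as 17 bytes at 200 gas each, and that it then recurses on each half. The formula, its constants, and the function name are assumptions made for this illustration; they should be checked against the solc sources for the version in use.

```python
CREATE_DATA_GAS = 200    # assumed gas cost per byte of deployed code
SPLIT_EXTRA_BYTES = 17   # assumed bytecode overhead of one split


def linear_sequences(n_functions: int, runs: int) -> list:
    """Sizes of the linear test sequences for a dispatcher of n functions."""
    worth_splitting = (
        n_functions > 4
        and runs * 6 * (n_functions - 4) > SPLIT_EXTRA_BYTES * CREATE_DATA_GAS
    )
    if not worth_splitting:
        return [n_functions]
    # Assumed pivot: the middle selector. Selectors below it go to the first
    # half, the pivot and the selectors above it go to the second half.
    pivot_index = n_functions // 2
    return (linear_sequences(pivot_index, runs)
            + linear_sequences(n_functions - pivot_index, runs))


if __name__ == "__main__":
    print(linear_sequences(6, 200))    # [6]
    print(linear_sequences(6, 300))    # [3, 3]
    print(linear_sequences(11, 1000))  # [2, 3, 3, 3]
```

Under these assumptions, 6 functions are split into two series of 3 tests starting at 284 runs, and 11 functions at 1000 runs give three series of 3 tests and one of 2, which agrees with the observations reported in this article.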
[Table: number of linear sequences as a function of the runs level (R) and the number of functions (F), resulting from these tests]
Are these thresholds (associated with runs values) likely to evolve with subsequent versions of the solc compiler?
Let's delve into an example for a contract with 11 functions to visualize the impact on gas consumption.
With 11 eligible functions and a higher runs level (--optimize-runs 1000), we transition from two ranges (one of 6 + one of 5) to four ranges (three of 3 + one of 2).
This time, I won't reproduce the opcodes and the associated diagram. To clarify the explanation, here is the execution flow in the form of pseudo-code, resembling code in the C language.
// [tag 1]
// 1 gas (JUMPDEST)
if( selector >= 0x799EBD70) {                          // 22 = (3+3+3+3+10) gas
    if( selector >= 0xB9E9C35C) {                      // 22 = (3+3+3+3+10) gas
        if( selector == 0xB9E9C35C) { goto storeF }    // 22 = (3+3+3+3+10) gas
        if( selector == 0xC534BE7A) { goto storeA }    // 22 = (3+3+3+3+10) gas
        if( selector == 0xE45F4CF5) { goto storeE }    // 22 = (3+3+3+3+10) gas
        revert()
    }
    // [tag 15]
    // 1 gas (JUMPDEST)
    if( selector == 0x799EBD70) { goto storeG }        // 22 = (3+3+3+3+10) gas
    if( selector == 0x9AE4B7D0) { goto storeB }        // 22 = (3+3+3+3+10) gas
    if( selector == 0xB87C712B) { goto storeD }        // 22 = (3+3+3+3+10) gas
    revert()
} else {
    // [tag 14]
    // 1 gas (JUMPDEST)
    if( selector >= 0x4CF56E0C) {                      // 22 = (3+3+3+3+10) gas
        if( selector == 0x4CF56E0C) { goto storeC }    // 22 = (3+3+3+3+10) gas
        if( selector == 0x6EC51CF6) { goto storeJ }    // 22 = (3+3+3+3+10) gas
        if( selector == 0x75A64B6D) { goto storeH }    // 22 = (3+3+3+3+10) gas
        revert()
    }
    // [tag 16]
    // 1 gas (JUMPDEST)
    if( selector == 0x183301E7) { goto storeI }        // 22 = (3+3+3+3+10) gas
    if( selector == 0x2E64CEC1) { goto retrieve }      // 22 = (3+3+3+3+10) gas
    revert()
}
The branches around the different "pivot" values are more clearly distinguished:
- 0x799EBD70 is the first pivot value.
- 0x4CF56E0C and 0xB9E9C35C are the secondary pivot values.
I used the code of a Solidity contract with 11 eligible functions for the "function dispatcher" as a reference to estimate the gas cost of the selection, depending on whether the search is linear or split.
Only the cost of selection in the "function dispatcher" is estimated, not the execution of the functions. We don't concern ourselves with what the function does or how much gas it consumes, nor with the code that extracts the function's selector from the calldata area.
The estimation of gas costs for the used opcodes was done with the assistance of the following sites:
- Ethereum Yellow Paper (Berlin version)
- EVM Codes - An Ethereum Virtual Machine Opcodes Interactive Reference
The relevant opcodes in play for our purposes are as follows:
Mnemonic | Gas | Description |
---|---|---|
JUMPDEST | 1 | Mark valid jump destination. |
DUP1 | 3 | Clone 1st value on stack. |
PUSH4 0xXXXXXXXX | 3 | Push 4-byte value onto stack. |
GT | 3 | Greater-than comparison. |
EQ | 3 | Equality comparison. |
PUSH [tag] | 3 | Push 2-byte value onto stack. |
JUMPI | 10 | Conditionally alter the program counter. |
This allowed me to estimate the selection gas cost of each function for the threshold values of 200 and 1000 runs, which lead to different processing: sequential for 200 runs and split for 1000 runs.
Signatures | Selectors | Gas (linear) | Gas (split) |
---|---|---|---|
storeI(uint256) | 183301E7 | 22 (min) | 69 |
retrieve() | 2E64CEC1 | 44 | 91 |
storeC(uint256) | 4CF56E0C (2) | 66 | 69 |
storeJ(uint256) | 6EC51CF6 | 88 | 90 |
storeH(uint256) | 75A64B6D | 110 | 112 (max) |
storeG(uint256) | 799EBD70 (1) | 132 | 68 |
storeB(uint256) | 9AE4B7D0 | 154 | 90 |
storeD(uint256) | B87C712B | 176 | 112 (max) |
storeF(uint256) | B9E9C35C (2) | 198 | 67 (min) |
storeA(uint256) | C534BE7A | 220 | 89 |
storeE(uint256) | E45F4CF5 | 242 (max) | 111 |
- (1): First pivot value for 1000 runs
- (2): Secondary pivot values for 1000 runs
Let us take a closer look at some statistics for both types of search.
Statistic | Linear | Split |
---|---|---|
Min | 22 | 67 |
Max | 242 | 112 |
Average | 132 | 88 |
Standard deviation | 72.97 | 18.06 |
We observe significant differences: a lower average (-33%) and a considerably lower standard deviation (4 times less) in favor of the split processing.
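These figures can be reproduced with a few lines of Python. The sketch below rebuilds the linear costs from the 22 gas per test derived from the opcode table, and takes the split costs as listed in the gas table above. The standard deviation used is the sample standard deviation (`statistics.stdev`), which is the variant that matches the reported values.

```python
import statistics

# DUP1 + PUSH4 + EQ (or GT) + PUSH [tag] + JUMPI
GAS_PER_TEST = 3 + 3 + 3 + 3 + 10

# Linear search: the n-th selector (ascending order) is reached after n tests.
linear = [GAS_PER_TEST * position for position in range(1, 12)]

# Split search: selection costs copied from the gas table, in the same order.
split = [69, 91, 69, 90, 112, 68, 90, 112, 67, 89, 111]

if __name__ == "__main__":
    for name, costs in (("linear", linear), ("split", split)):
        print(
            name,
            min(costs),
            max(costs),
            round(statistics.mean(costs)),
            round(statistics.stdev(costs), 2),
        )
    # linear 22 242 132 72.97
    # split 67 112 88 18.06
```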
Depending on the algorithm used by the Solidity compiler to generate the "function dispatcher," the processing order of functions will differ, both from the order of declaration in the source code and from the alphabetical order.
Linear search (ascending selector order):
# | Signatures | Selectors |
---|---|---|
1 | storeI(uint256) | 183301E7 |
2 | retrieve() | 2E64CEC1 |
3 | storeC(uint256) | 4CF56E0C |
4 | storeJ(uint256) | 6EC51CF6 |
5 | storeH(uint256) | 75A64B6D |
6 | storeG(uint256) | 799EBD70 |
7 | storeB(uint256) | 9AE4B7D0 |
8 | storeD(uint256) | B87C712B |
9 | storeF(uint256) | B9E9C35C |
10 | storeA(uint256) | C534BE7A |
11 | storeE(uint256) | E45F4CF5 |
The number of tests and the complexity of the process are proportional to the number of functions, in O(n).
Split search (order of increasing selection cost):
# | Signatures | Selectors |
---|---|---|
1 | storeF(uint256) | B9E9C35C |
2 | storeG(uint256) | 799EBD70 |
3 | storeI(uint256) | 183301E7 |
4 | storeC(uint256) | 4CF56E0C |
5 | storeA(uint256) | C534BE7A |
6 | storeJ(uint256) | 6EC51CF6 |
7 | storeB(uint256) | 9AE4B7D0 |
8 | retrieve() | 2E64CEC1 |
9 | storeE(uint256) | E45F4CF5 |
10 | storeH(uint256) | 75A64B6D |
11 | storeD(uint256) | B87C712B |
This is not a binary search in the strict sense of the term but rather a segmentation into groups of sequential tests around pivot values. In the end, however, the complexity is the same as that of a binary search, in O(log n).
Even if we assume that all functions are called equally often, their calls will not cost the same, because the cost depends on their selectors (and therefore on their names). The cost of selecting one of these functions, regardless of the algorithm, is clearly very uneven, and while it can be estimated, it cannot be chosen directly.
However, by strategically renaming functions, adding suffixes, for example, you can influence the results of function signatures and, consequently, the gas costs associated with these functions. This practice can optimize gas consumption in your smart contract, not only during function selection in the EVM but also, as we will see later, during transactions.
The cost of a transaction consists of two parts: the intrinsic cost (including those related to the useful data of transactions) and the execution cost. Our optimizations focus on these two costs.
You can find more information on the breakdown of transaction costs on this page.
The combination of these two optimization approaches makes a significant difference by reducing gas consumption in smart contracts. This is particularly crucial in certain areas such as MEV (arbitrage) where optimization is vital.
To illustrate, changing the function signature square(uint32) to square_low(uint32) changes the selector from d27b3841 to bde6cad1.
The lower value of the new selector moves the function earlier in the selection order. This optimization can matter for contracts with many functions, since it reduces the number of tests needed to find the function to call and therefore the gas consumed on the Ethereum blockchain.
The fact that the search is split rather than linear complicates matters a bit: depending on the number of functions and the compiler's optimization level, it is harder to determine which threshold values new selectors should target to obtain the desired order.
When you send a transaction on the Ethereum blockchain, you include data specifying which function of the smart contract you want to call and what the arguments of that function are. The gas cost of a transaction partly depends on the number of zero bytes in the transaction data.
As specified in the Ethereum Yellow Paper (Berlin version):
- Gtxdatazero costs 4 gas for each zero byte in the transaction data.
- Gtxdatanonzero costs 16 gas for each non-zero byte, which is 4 times more expensive.
Thus, whenever a zero byte (00) is used in msg.data instead of a non-zero byte, it saves 12 gas.
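As a quick illustration of this rule, the sketch below computes the calldata cost of the four selector bytes alone, using the two Yellow Paper constants quoted above. The function name is hypothetical, and the rest of the transaction data (the arguments) is deliberately ignored.

```python
G_TX_DATA_ZERO = 4       # gas per zero byte of transaction data
G_TX_DATA_NONZERO = 16   # gas per non-zero byte of transaction data


def selector_calldata_gas(selector_hex: str) -> int:
    """Calldata gas paid for the 4 bytes of a selector given as hex digits."""
    if selector_hex.startswith("0x"):
        selector_hex = selector_hex[2:]
    raw = bytes.fromhex(selector_hex)
    if len(raw) != 4:
        raise ValueError("a selector is exactly 4 bytes long")
    return sum(G_TX_DATA_ZERO if byte == 0 else G_TX_DATA_NONZERO for byte in raw)


if __name__ == "__main__":
    print(selector_calldata_gas("d27b3841"))  # 64 (no zero byte)
    print(selector_calldata_gas("00001878"))  # 40 (two zero bytes)
    print(selector_calldata_gas("00000000"))  # 16 (four zero bytes)
```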
A comparable distinction between zero and non-zero values also exists for storage writes, with the Gsset and Gsreset costs applied to the SSTORE opcode. To illustrate the effect on transaction data, changing the function signature square(uint32) to square_Y7i(uint32) changes the selector from d27b3841 to 00001878.
The two leading zero bytes of the selector (0000) not only prioritize the processing of the call to this function, as seen earlier, but also make the four selector bytes cheaper to send as transaction data (40 gas instead of 64).
Here are some additional examples:
Signatures (optimal) | Selectors (optimal) | Signatures | Selectors |
---|---|---|---|
deposit_ps2(uint256) | 0000fee6 | deposit(uint256) | b6b55f25 |
mint_540(uint256) | 00009d1c | mint(uint256) | a0712d68 |
b_1Y() | 00008e0c | b() | 4df7e3d0 |
Similarly, using selectors with three zero bytes brings the cost down to only 28 gas.
For instance, deposit278591A(uint) and deposit_3VXa0(uint256), with respective selectors 00000070 and 0000007e, achieve this optimization.
However, since each selector must be unique within a contract, only one function per contract can have a selector made of four zero bytes (00000000), even though several signatures hash to it. Such a selector costs only 16 gas (for example with the signature execute_44g58pv()).
Signatures | Selectors | # of zeros | Gas | Gain (gas) |
---|---|---|---|---|
execute() | 61461954 | 0 | 64 | 0 |
execute_5Hw() | 00af0043 | 1 | 52 | 12 |
execute_mAX() | 0000eb63 | 2 | 40 | 24 |
execute_6d4S() | 000000ae | 3 | 28 | 36 |
execute_44g58pv() | 00000000 | 4 | 16 | 48 |
I have developed Select0r, a tool written in Rust that allows you to rename your functions to optimize their calls. The program, given a function signature, will provide a list of alternative signatures with lower gas costs, enabling better ordering for the "function dispatcher."
- Optimizing gas costs is a crucial aspect of designing efficient smart contracts on Ethereum.
- By paying attention to details such as the order of function signatures, the number of leading zeros in the hash, the order of function processing, and function renaming, you can significantly reduce the costs associated with your contract.
- However, be aware that this may reduce the user-friendliness and readability of your code.
- Optimization for execution may not be necessary for so-called administrative functions or those infrequently called.
- On the other hand, it should be prioritized for functions assumed to be the most frequently called (to be determined manually or statistically during practical tests).
- A single optimization may seem insignificant, especially compared to the overall cost of a transaction. However, a set of optimizations performed on a series of transactions makes all the difference, and it's not limited to optimizations on the "function dispatcher."
In the end, these optimizations can make the difference between a cost-effective contract and one that is gas-expensive.
Credits: Franck Maussand
Special thanks to Igor Bournazel for his suggestions and proofreading of this article.
References:
- Ethereum Yellow Paper
- Opcodes for the EVM
- EVM Codes - An Ethereum Virtual Machine Opcodes Interactive Reference
- Operations with dynamic Gas costs
- Contract ABI Specification — Solidity 0.8.22 documentation
- Yul — Solidity 0.8.22 documentation
- Yul — Complete ERC20 Example
- Using the Compiler — Solidity 0.8.22 documentation
- The Optimizer — Solidity 0.8.22 documentation
Misc:
- Function Dispatching | Huff Language
- Solidity’s Cheap Public Face
- Web3 Hacking: Paradigm CTF 2022 Writeup
- paradigm-ctf-2022/hint-finance at main · paradigmxyz/paradigm-ctf-2022 · GitHub
- GitHub - Laugharne/solc_runs_dispatcher
- WhatsABI? with Shazow - YouTube
- Ethereum Yellow Paper Walkthrough (4/7) - Transaction Execution