Skip to content

Evolution of Ethereum: Future Upgrades

tr1sm0s1n edited this page May 22, 2024 · 7 revisions

Sharding to Dank-sharding

To deal with the time delays in transaction processing, Ethereum proposed sharding. The original idea of sharding was to create multiple parallel chains (shards) — each handling a section of the overall transactions occurring in Ethereum, connected to the main chain. Each shard will have a committee — a group of validators. The shards will be divided among the nodes so that each node does less work and the work gets done faster. The implementation is challenging and also takes considerable time to get implemented.

Meanwhile, smart contract developers also tried various methods to deal with scalability. It led to the evolution of Layer 2 solutions or L2. L2 solutions like rollups were successful in improving the transaction processing time. Their success was followed by an increased number of rollup proposals and witnessed a vast user base also.

How did rollups improve scalability?

Rollups are applications that run on top of Ethereum. They have a separate chain, a network of nodes and a consensus mechanism. The rollup chain executes the Ethereum transactions on its chain, “rolls” them up into one batch, and the results of transaction execution will be stored on Ethereum. Ethereum should have a provision to verify the validity of these transactions executed off-Ethereum chain. For this, rollups use cryptographic proofs. Zero Knowledge rollups (zk-rollups) use validity proofs, and optimistic rollups use fraud proofs. The rollup manages the expensive and computationally-intensive data processing and uses Ethereum only to store some data. It also uses calldata, the cheapest storage space available, for data storage. Rollups considerably reduce the transaction fee and time.

97% of the fees on a roll-up are related to calldata storage on Ethereum. The expense arises from all the Ethereum nodes processing it and persists on the chain forever. But rollups do not actually need this data on-chain for so long. After giving sufficient time for the provers to verify the rollup transaction, the data can be stored off-chain by rollup operators or other users.

If Ethereum data storage cost becomes lower, rollups can also become cheaper. Since sharding proposes an L1 scaling solution that alters the protocol, implementing it may take years. The Ethereum community acknowledged that rollups could provide quick results and improve once sharding is implemented. So the upgrade plans were revamped into rollup-centric designs. Consequently, sharding plans were also updated to Danksharding.

Danksharding

The danksharding refers to an updated sharding design proposed by a researcher, Dankrad Fiest. Danksharding supports rollups by providing a separate cheaper space for storing data from rollups. The complete implementation of danksharding is complex and time taking. The current sharding design EIP-4844, known as Proto-Danksharding, is an intermediary version of danksharding. It derives its name from Ethereum researchers Protolambda and Dankrad Fiest.

EIP-4844: Shard Blob Transactions

EIP-4844 introduces a new kind of transaction type which accepts “blobs” of data to be stored in the beacon node for a short time. Blob stands for "binary large objects", which refer to dynamic memory spaces. A blob can store 4096 elements of 32 bytes each, with a long-term maximum of 16 blobs per block, thereby allowing 4096 _ 32 bytes _ 16 per block = 2 MiB per block maximum. The blob cap per block can start low and grow over multiple network upgrades.

Proto-Danksharding says that the rollup data can be stored in blobs, and these blobs can be stored in the beacon nodes (consensus layer). Though they can be larger, they are relatively cheap to transact with as the consensus layer holds them rather than Ethereum’s computationally expensive execution layer. The data stored in the blob is needed for verifying the transaction execution by rollup. But it is not required to be available forever. So it will be automatically deleted after a specific period (1 to 3 months). The blob data is not accessible by EVM. This makes it cheaper.

Blob

The blob storage cost is measured using a separate metering mechanism called data gas; 1 byte = 1 data gas. Each data gas will be priced using a specific pricing logic.

The rollups are expected to post a commitment to their transaction data on-chain and the data itself within the blobs. The commitment will be accessible to EVM and persist, whereas the transaction data in the blobs will be periodically pruned. If blob data is to be stored forever, it will shoot up the storage requirements of the consensus client. This low-cost non-persistent storage option in the consensus client can help rollups to store their data at a lower cost, thereby reducing the transaction cost.

How can the blobs be verified?

Let us understand using a super simplified example. Imagine you have a list of data, say 100 items, <D0, D1, D2, …D100> which can be plotted in a graph and represented by the curve below.

Blob Graph 1

The commitment for this data will be a set of secret data points; let’s assume 4 data points <D12, D20, D78, D99>, which evaluates to the highlighted positions <P0, P1, P2, P3> in the curve. This commitment will be stored on the chain.

Blob Graph 2

The prover now tries to fit the same curve over the blob data. To verify it, the prover will re-calculate the position of the data points in the commitment. If any blob data has been changed, the curve will be different, and consequently, the commitment data points may not fit in the altered graph.

Blob Graph 3
The illustration corresponds to a simplified explanation for a beginner-level reader. The actual proposed procedure for blob transaction processing uses cryptographic commitment technique called KZG scheme which is an alternative to Merkle proofs. The KZG fits a polynomial equation to the data. The commitment evaluates the polynomial at some secret data points. A prover would fit the same polynomial over the data and evaluate it at the same values, checking that the result is the same.

Merkle tree to Verkle Trees

The Verge is one amongst the potential upgrades in the roadmap proposed by Ethereum co-founder Vitalik Buterin for revamping blockchain technology. It is set to address the scalability trilemma the blockchain platform continues to face. The “Verge” puts forward the idea of “Verkle tree” that is set to replace the erstwhile Merkle proofs/Merkle tree.

In a blockchain, there are numerous blocks lined up indicating the number of transactions verified and validated. This is only set to increase with more participants/nodes joining the blockchain network. So every block contains a set of data including all the details pertaining to the transactions. With the addition of more blocks into the blockchain, there is a proportional increase in the database too!

In order to manage these enormous databases and to keep the blockchain running, the “state” concept comes to the fore. The state includes the block data in a specialized format using the Merkle tree concept by keeping all accounts linked by hashes and finally reducing to a single root hash stored on the blockchain. This root hash or the Merkle proof poses as a stronger digital signature with a simplified verification procedure. The Merkle proofs not only ease the block verification procedure but also helps in reducing the disc space requirements for performing unhindered transactions.

But unknown to many there lies a hidden concern of bulging state size. According to the Ethereum Foundation, the state size and the Merkle Proofs together have exceeded 135+ gigabytes of storage space. In order to overcome this dilemma, the “ Verge” stage of Ethereum upgradation proposes the idea of the Verkle tree. Introduced by John Kuszmaul in 2018, the Verkle tree concept works similarly to the Merkle tree but here the proofs are shorter in size, and the proof calculation time is also considerably reduced.

Let us consider a simple example.

Generally, the Merkle tree is represented as a binary tree where the branching factor of nodes is 2. In the example discussed here, we will explore a set of 9 transactions belonging to a block added in a blockchain network using a k-ary Merkle tree and Verkle tree where k is the branching factor.

Consider the branching factor as 3. Initially the transactions are hashed using a cryptographic hash function. These hashed transactions are grouped into subsets, further generating a hash from the subsets until the root hash is computed. The root hash or Merkle proof denotes the entire block of transactions. The figure below shows how a Merkle tree is constructed.

[image]

As depicted in the figure, the hashes of sibling nodes of T3 i.e. Hash(T1) and Hash(T2) + concatenated hashes of T4 to T9 can cumulatively help in tracing if T3 belongs to the entire set of transactions. And here the root Hash [T123456789] is the Merkle proof that serves as the digest (footprint) for the entire set of transactions.

Unlike the Merkle tree which uses a cryptographic hash function to calculate the proof, the Verkle tree uses the Vector Commitment (VC) method for computing the same. The Vector Commitment is computed using the polynomial functions.

[image]

Thus for a Verkle Tree with a certain number of transactions, say, T1, T2, T3 …..T9 , the branching factor of the tree, k (here k = 3) is considered first. The Vector Commitment is then computed over each of these subsets. Here C1, C2, and C3 are the resulting Vector Commitments. The membership proofs πi for each transaction Ti in the subset is computed with respect to the Vector Commitment (note that i refers to the index position of the given transactions). The Vector Commitment C4 is computed over these three commitments along with their membership proofs π10 , π11 , and π12 respectively. Hence, C4 is the root commitment that forms the digest of the Verkle Tree.

To comprehend in a better way, consider Figure 2, here (T3,π3) means that transaction T3 exists in Commitment C1 and π3 is proof of this existence. Similarly (C1,π10) means that the Commitment C1 exists in the root Commitment C4 and π10 is proof of this existence.

Vector commitments are cryptographic techniques which allow committing to an ordered sequence of k values (t1, t2 . . . ,tk ) in such a way that one can later reveal one or many values at a specific position and prove it consistent with the initial commitment (for e.g., prove that ti is the i-th committed value) values. A Merkle tree is also a vector commitment, with the property that opening the i-th value requires a specific number of hashes as proof (Merkle proof). The Ethereum team plans the use of KZG polynomial commitment using elliptic-curve cryptography scheme to replace the hash functions used in the Merkle tree.

Thus for a tree with billion data storage points, making Merkle proofs cumulatively would require about 1 kilobyte, but in a Verkle tree the proof would be less than 150 bytes. With the addition of nodes, the Merkle proof calculations continue to expand leading to an increase in the depth of the Merkle tree. On the contrary, the proof size in the Verkle tree remains constant and the depth of the tree is considerably reduced.

So while a verifier has to look through multiple hash calculations at each level to reach the Merkle root, the proofs in the Verkle tree simplify the process by requiring to provide only a single proof at each level to reach the root Commitment, thereby proving that a transaction or a piece of data is part of a given set.

Proposer-Builder Separation (PBS)

Currently, ethereum blocks are built by the validators. Validators select the transactions to be included in a block, and the chosen block proposer validator creates the block, whereas the validator committee votes this proposed block.

Maximum Extractable Value (MEV) refers to validators maximizing their profitability by choosing the transaction ordering. Maximizing MEV requires sophisticated technical knowledge and custom software appended to normal validators. It is observed that most the institutional operators are better in MEV extraction compared to other validators. This implies that staking returns are likely to be higher with centralized operators.

PBS solves this problem by reconfiguring the economics of MEV. As per the proposer-builder separation, block builders are responsible for creating the blocks on behalf of the block proposer. There will be multiple block builders who propose the block, and the block proposer chooses one. The proposer pays a fee to the builder before sharing the block with the network.

PBS prevents transaction censorship at the protocol level by avoiding situations where the block builder is favouring certain transactions and disregarding some other set of transactions. This separation can enforce complex inclusion criteria that prevent censorship. For example, inclusion lists can be introduced so that when validators notice transactions which are not included in blocks, they can tag them to be mandatory in the next block. The block builders being specialized entities may be able to maximise the earnings from block rewards. They will be selecting the transactions from the mempool in such a way that the transaction fee and block reward earnings of validator will be maximum. These specialized block builders may also be able to compute the necessary data proofs for Danksharding.

Clone this wiki locally