From cddb32c7e2a70c51ec3880e5b5ffd4fb3a535187 Mon Sep 17 00:00:00 2001
From: Bradley Olson <34992650+BradleyOlson64@users.noreply.github.com>
Date: Sun, 16 Jun 2024 18:57:17 -0700
Subject: [PATCH 1/2] Brad updated slides (#1051)
* Cumulus slides updates
* Made edits to async backing slides
* Availability slides edits
* Updated images
* Ran prettier
---
...a-sharding-and-data-availability-slides.md | 50 ++++---
.../7-Polkadot/6-cumulus-deep-dive-slides.md | 134 +++++++++---------
.../7-asynchronous-backing-slides.md | 31 ++--
.../async_backing_1.svg | 2 +-
.../async_backing_2.svg | 2 +-
.../async_backing_3.svg | 2 +-
.../unincluded_segment_1.svg | 2 +-
.../unincluded_segment_2.svg | 2 +-
8 files changed, 115 insertions(+), 110 deletions(-)
diff --git a/syllabus/7-Polkadot/5-data-sharding-and-data-availability-slides.md b/syllabus/7-Polkadot/5-data-sharding-and-data-availability-slides.md
index 59f24239c..a387faeee 100644
--- a/syllabus/7-Polkadot/5-data-sharding-and-data-availability-slides.md
+++ b/syllabus/7-Polkadot/5-data-sharding-and-data-availability-slides.md
@@ -1,5 +1,5 @@
---
-title: Data Availability and Data Sharding Deep Dive
+title: Data Availability and Data Sharding
description: Data Availability Problem, Erasure coding, Data sharding.
duration: 1h
owner: Bradley Olson
@@ -23,23 +23,30 @@ owner: Bradley Olson
---
-## What Data Needs Availability?
+## What Data Needs Offchain Availability?
-Sharded Permanent Record: **Parachains!**
+Sharded Permanent Record? ❌ **Parachains!**
-Condensed Permanent Record: **Relay Chain!**
+Condensed Permanent Record? ❌ **Relay Chain!**
-Comprehensive 24 Hour Record: **Polkadot DA!**
+
+
+Comprehensive 24 Hour Record? ✅ **Polkadot DA!**
+
+
Notes:
-- Most data live solely on parachains
+- Most permanent data live solely on parachains
- Condensed data, hashes and commitments, stored on relay chain
-- DA secures heavy (MBs) information critical to the secure progression of parachains. Should be dropped from validators when old.
+- Polkadot DA secures proofs of validity. PoVs contain information critical to the secure progression of parachains. These are dropped from validators when old.
+- Way too large for on-chain storage!
+ - All other data added to relay chain per day: ~555M
+ - 40 PoVs per block for a day: ~72G
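The ~72G figure can be sanity-checked with quick arithmetic. A minimal sketch, assuming 6-second relay blocks and an average PoV size of ~128 KiB (the average PoV size is a hypothetical assumption here, not a number from the slides or a protocol constant):

```rust
// Back-of-envelope check of the ~72G/day figure.
// Assumes 6-second relay blocks and an average PoV of ~128 KiB
// (hypothetical average, not a protocol constant).
fn main() {
    let blocks_per_day: u64 = 24 * 60 * 60 / 6; // 14_400 relay blocks
    let povs_per_block: u64 = 40;
    let avg_pov_bytes: u64 = 128 * 1024;
    let total = blocks_per_day * povs_per_block * avg_pov_bytes;
    println!("~{} GiB of PoV data per day", total / (1 << 30));
}
```

With these assumptions the total lands at ~70 GiB/day, in the same ballpark as the slide's ~72G, and several orders of magnitude beyond the ~555M of all other daily relay chain data.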
---
@@ -52,13 +59,10 @@ Incorrectness can be proven (merkle proofs), but unavailability can't.
Notes:
- You can't just hold a small number of nodes accountable for making some data available
-- Needs an off chain solution!
- - All other data added to relay chain per day: ~555M
- - 40 PoVs per block for a day: ~72G
---
-### Data Availability Problem: Parachains
+### Vulnerabilities With No DA: Malicious Collator
@@ -79,19 +83,19 @@ Solution:
---
-### Data Availability Problem: Relay Chain
+### Vulnerabilities With No DA: Malicious Backers
Notes:
-- Malicious backers could distribute invalid PoV to only malicious approval checkers
+- Malicious backers could distribute invalid PoV selectively to malicious approval checkers or not at all
- Really bad
- It means attackers could consistently finalize invalid parachain blocks with just a handful of dishonest approval checkers
---
-### Data Availability Problem: Relay Chain
+### Vulnerabilities With No DA: Malicious Backers
@@ -132,12 +136,14 @@ Notes:
- Minimal unit of Polkadot execution scheduling
-- At most 1 candidate pending availability per relay block, per core
+- Cores abstract over several processing and storage resources
-- Considered "occupied" while a candidate paired with that core is pending availability
+- At most 1 candidate pending availability per relay block, per core
-- It saves resources to bundle signals about availability for all cores together
+- Considered "occupied" while a candidate paired with that core is pending availability
+- It saves resources to bundle signals about availability for all cores together
+
@@ -147,6 +153,14 @@ Notes:
+Notes:
+
+- Cores abstract over
+ - Backing processing
+ - DA processing
+ - DA storage
+ - On-Chain Inclusion
+
---
### Laying the Foundation: Erasure Coding
@@ -333,6 +347,8 @@ For any number $n$ of points $(x_i,y_i)$ there exists only one polynomial of deg
Notes:
+- We can mathematically derive the interpolating polynomial for any set of points with distinct x values.
+
Question: What are $x_i$ and $y_i$ with respect to our data?
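The uniqueness claim above can be demonstrated with a short sketch. Lagrange interpolation over `f64` is used purely for illustration; production erasure coding works over a finite field, and the points and polynomial here are illustrative, not Polkadot's actual parameters:

```rust
// Sketch: n points with distinct x values determine a unique polynomial of
// degree n-1. Here we evaluate the interpolating polynomial at a new x.
fn lagrange_eval(points: &[(f64, f64)], x: f64) -> f64 {
    points
        .iter()
        .enumerate()
        .map(|(i, &(xi, yi))| {
            // Lagrange basis polynomial for point i, evaluated at x
            let basis: f64 = points
                .iter()
                .enumerate()
                .filter(|&(j, _)| j != i)
                .map(|(_, &(xj, _))| (x - xj) / (xi - xj))
                .product();
            yi * basis
        })
        .sum()
}

fn main() {
    // Three "shares" of y = x^2; any three points pin down the parabola,
    // so a value at a new x can be recovered from the others.
    let pts = [(0.0, 0.0), (1.0, 1.0), (2.0, 4.0)];
    let recovered = lagrange_eval(&pts, 3.0);
    assert!((recovered - 9.0).abs() < 1e-9);
}
```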
---
diff --git a/syllabus/7-Polkadot/6-cumulus-deep-dive-slides.md b/syllabus/7-Polkadot/6-cumulus-deep-dive-slides.md
index d6c7d8c4f..bdeea8909 100644
--- a/syllabus/7-Polkadot/6-cumulus-deep-dive-slides.md
+++ b/syllabus/7-Polkadot/6-cumulus-deep-dive-slides.md
@@ -52,7 +52,8 @@ Notes:
- Substrate is a blockchain building framework
- But only "solo" chains
- Split into runtime/node side
-- Both Polkadot and Cumulus extend substrate
+- Polkadot is built using Substrate
+- Cumulus extends Substrate to allow any Substrate chain to operate as a parachain
- Polkadot provides APIs to collators
---
@@ -140,12 +141,12 @@ Notes:
---
-### Import Driven Block Authoring
+### Key Process 1: Import Driven Block Authoring
-Collators are responsible for authoring new blocks, and they do so when importing relay blocks.
+Collators are responsible for authoring new blocks. Prior to the rollout of asynchronous backing, they did so when importing relay blocks.
Honest Collators will choose to author blocks descending from the best head.
-```rust[|3|4-8|11]
+```rust[3|4-8|11]
// Greatly simplified
loop {
let imported = import_relay_chain_blocks_stream.next().await;
@@ -165,19 +166,17 @@ loop {
Notes:
-- `parachain_trigger_block_authoring` itself can decide if it wants to build a block.
-- e.g. the parachain having a block time of 30 seconds
-- With asynchronous backing, parachain block authoring is untethered from relay block import.
+- With asynchronous backing, parachain block authoring is untethered from relay block import, though still ultimately reliant on it.
---
-### Finality
+### Key Process 2: Finality Updates
To facilitate shared security, parachains inherit their finality from the relay chain.
-```rust[|3|4-8|11]
+```rust[3|4-8|11]
// Greatly simplified
loop {
let finalized = finalized_relay_chain_blocks_stream.next().await;
@@ -194,9 +193,9 @@ loop {
---
-### Ensuring Block Availability
+### Key Process 3: Ensuring Block Availability
-As a part of the parachains protocol, Polkadot makes parachain blocks available for several hours after they are backed.
+As a part of the parachains protocol, Polkadot makes parachain blocks available for 24 hours after they are backed.
@@ -206,13 +205,13 @@ As a part of the parachains protocol, Polkadot makes parachain blocks available
- Malicious collator
+
+
+> What role does Cumulus play?
Notes:
-- Approvers need the PoV to validate
-- Can't just trust backers to distribute the PoV faithfully
-- Malicious or faulty collators may advertise collations to validators without sharing them with other parachain nodes.
-- Cumulus is responsible for requesting missing blocks in the latter case
+- When Cumulus learns of a new parachain block via a receipt on the relay chain, it is responsible for deciding how long to wait before concluding that the block is missing and then requesting it from Polkadot DA.
---
@@ -252,14 +251,12 @@ Notes:
---
-# Collation Generation and Advertisement
+# Key Process 4: Collation Generation and Advertisement
---
## Collation Generation
-The last of our key processes
-
1. Relay node imports block in which parachain's avail. core was vacated
@@ -282,27 +279,9 @@ The last of our key processes
Notes:
-- First, sent to tethered relay node `CollationGeneration` subsystem to be repackaged and forwarded to backers
-- At least one backer responds, signing its approval
-- Triggers gossip of candidate to parachain node import queues
-
----
-
-#### Distribution in Code
-
-```rust[1|5]
-let result_sender = self.service.announce_with_barrier(block_hash);
-
-tracing::info!(target: LOG_TARGET, ?block_hash, "Produced proof-of-validity candidate.",);
-
-Some(CollationResult { collation, result_sender: Some(result_sender) })
-```
-
-Notes:
-
-- Prepares the announcement of a new parablock to peers with "announce_with_barrier"
-- Waits for green light from validator by sending it a "result_sender"
-- When validator sends positive result through sender, then the collator announces the block
+- New parablocks are communicated simultaneously in two ways:
+ - A collation is sent to the collator's tethered relay node to be processed in the `CollationGeneration` subsystem. There it is repackaged before being forwarded to backers.
+ - An advertisement of the new parablock candidate is gossiped to parachain node import queues
---
@@ -334,7 +313,7 @@ Notes:
-- Building Blocks to make this possible, the PVF and PoV, are delivered within collations
+- The building blocks to make this possible, the PVF and PoV, are delivered within collations
@@ -394,7 +373,7 @@ The code is hashed and saved in the storage of the relay chain.
Notes:
-PVF not only contains the runtime, but also a function `validate_block` needed to interpret all the extra information in a PoV required for validation.
+The PVF contains a function `validate_block` needed to interpret all the extra information in a PoV required for validation.
This extra information is unique to each parachain and opaque to the relay chain.
---
@@ -431,7 +410,7 @@ The input of the runtime validation process is the PoV, and the function called
#### Validate Block in Code
-```rust [2|3-4|6|8-11]
+```rust [2|3-4|6|8-11|14]
// Very simplified
fn validate_block(input: InputParams) -> Output {
// First let's initialize the state
@@ -449,14 +428,14 @@ fn validate_block(input: InputParams) -> Output {
}
```
-
-
-> But where does `storage_proof` come from?
-
Notes:
-We construct the sparse in-memory database from the storage proof and
-then ensure that the storage root matches the storage root in the `parent_head`.
+1. We construct the sparse in-memory database from the storage proof and then ensure that the storage root matches the storage root in the `parent_head`.
+2. Replace host functions
+3. Execute block
+4. Create output. We check whether the `storage_root` and other outputs resulting from validation matched the commitments made by the collator.
+
+But where does `storage_proof` come from?
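Step 1 above reduces to a root-comparison idea that a minimal sketch can show. The hash and types below are stand-ins, not the real Cumulus trie machinery or the actual `validate_block` implementation:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-in "storage root": a hash over sorted key/value pairs.
// The real implementation builds a sparse in-memory Merkle trie.
fn storage_root(entries: &[(&str, &str)]) -> u64 {
    let mut sorted: Vec<_> = entries.to_vec();
    sorted.sort();
    let mut h = DefaultHasher::new();
    sorted.hash(&mut h);
    h.finish()
}

fn main() {
    // The parent head commits to a storage root; the storage proof
    // must reconstruct state with that exact root.
    let proof_entries = [("balance:alice", "100"), ("nonce:alice", "7")];
    let committed_root = storage_root(&proof_entries);
    // Rebuilding state from the proof must reproduce the committed root,
    // otherwise validation rejects the block.
    assert_eq!(storage_root(&proof_entries), committed_root);
}
```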
---
@@ -496,18 +475,20 @@ Notes:
Code highlighting:
-- CandidateCommitments: Messages passed upwards, Downward messages processed, New code (checked against validation outputs)
-- head_data & PoV (the validation inputs)
+- CandidateCommitments: Messages passed upwards, Downward messages processed, New code, `head_data` (checked against validation outputs)
+- PoV (the validation input)
---
-### Proof of Validity (Witness Data)
+### Witness Data (Storage Proof)
+- Makes up most of the information in a PoV
- Acts as a replacement for the parachain's pre-state for the purpose of validating a single block
-- It allows the reconstruction of a sparse in-memory merkle trie
-- State root can then be compared to that from parent header
+- It enables the construction of a sparse in-memory Merkle trie
+- The state root can then be compared to that from the parent header
+
---
@@ -537,9 +518,9 @@ Notes:
---
-#### Parablock Validation in Summary
+#### Witness Data in Validation
-```rust [2|3-4|6]
+```rust [4]
// Very simplified
fn validate_block(input: InputParams) -> Output {
// First let's initialize the state
@@ -557,10 +538,10 @@ fn validate_block(input: InputParams) -> Output {
}
```
-- Now we know where the **storage_proof** comes from!
+- Now we know where the **storage_proof** (witness data) comes from!
- into_state constructs our storage trie
-- Host functions written to access this new storage
+- Host functions rewritten to access this new storage
---
@@ -571,21 +552,23 @@ fn validate_block(input: InputParams) -> Output {
- Every Substrate blockchain supports runtime upgrades
+- Every time a validator wants to validate a parablock, it must first compile the PVF
+
##### Problem
-
+
- What happens if PVF compilation takes too long?
-
+
- Approval no-shows
- In disputes neither side may reach super-majority
-
+
> Updating a Parachain runtime is not as easy as updating a standalone blockchain runtime
-
+
@@ -595,12 +578,10 @@ fn validate_block(input: InputParams) -> Output {
The relay chain needs a fairly hard guarantee that PVFs can be compiled within a reasonable amount of time.
-
-
- Collators execute a runtime upgrade but it is not applied
-- Collators send the new runtime code to the relay chain in a collation
+- The new code is sent to the relay chain in a collation, as an `Option` field
- The relay chain executes the **PVF Pre-Checking Process**
- The first parachain block to be included after the end of the process applies the new runtime
@@ -618,17 +599,30 @@ https://github.com/paritytech/cumulus/blob/master/docs/overview.md#runtime-upgra
##### PVF Pre-Checking Process
+
+
+- Track
+- Check
+- Vote
+- Conclude
+- Upgrade
+- Notify
+
+
+
+
+Notes:
+
- The relay chain keeps track of all the new PVFs that need to be checked
- Each validator checks if the compilation of a PVF is valid and does not require too much time, then it votes
-
- binary vote: accept or reject
-
- Super majority concludes the vote
-
- The new PVF replaces the prior one in relay chain state
-
-
-Notes:
+- A "go ahead" signal is sent, telling the parachain to apply the upgrade
reference: https://paritytech.github.io/polkadot/book/pvf-prechecking.html
diff --git a/syllabus/7-Polkadot/7-asynchronous-backing-slides.md b/syllabus/7-Polkadot/7-asynchronous-backing-slides.md
index 62e3b0809..c6796ca46 100644
--- a/syllabus/7-Polkadot/7-asynchronous-backing-slides.md
+++ b/syllabus/7-Polkadot/7-asynchronous-backing-slides.md
@@ -39,15 +39,13 @@ Lets get to it
-> How is this synchronous?
-
-
-
Notes:
-- The dividing line between the left and right is when a candidate is backed on chain
-- Approvals, disputes, and finality don't immediately gate the production of farther candidates.
- So we don't need to represent those steps in this model.
+- In synchronous backing, backing for block `N+1` starts only after inclusion for block `N` is complete. Backing and inclusion cannot take place simultaneously for different candidates of the same parachain, and neither can outpace the other.
+- White arrows represent execution flow
+- The tasks within each of these purple circles take place during the time of a single relay chain block.
+- Why might each of these two groupings of tasks need its own relay block to execute?
+- Approvals, disputes, and finality not represented. Why?
---
@@ -61,14 +59,10 @@ Notes:
-> How is this asynchronous?
-
-
-
Notes:
-- Our cache of parablock candidates allows us to pause just before that dividing line, on-chain backing
-- Why is backing asynchronous in this diagram?
+- Notice the difference in the white execution arrow. Backing repeats, looping over and over. How is this possible?
+- Our cache of backable parablock candidates allows inclusion to loop independently
---
@@ -143,6 +137,7 @@ Image version 1:
Image version 3:
+- The execution context for a new parablock requires its parablock parent and relay parent
- Whole process is a cycle of duration 12 seconds (2 relay blocks).
- No part of this cycle can be started for a second candidate of the same parachain until the first is included.
@@ -194,7 +189,7 @@ Image version 3:
Notes:
-Limitation example, upward messages remaining before the relay chain would have to drop incoming messages from our parachain
+Limitation example: upward messages remaining before the relay chain would have to drop incoming messages from our parachain
---
@@ -251,7 +246,7 @@ Notes:
- Fragment trees are rooted in relay chain active leaves
- Fragment trees built per scheduled parachain at each leaf
- Fragment trees may have 0 or more fragments representing potential parablocks making up possible futures for a parachain's state.
-- Collation generation, passing, and seconding work has already been completed for each fragment.
+- Collation generation and seconding work has already been completed for each fragment.
---
@@ -265,8 +260,8 @@ Notes:
Returning to our most basic diagram
-- Q: Which structure did I leave out the name of for simplicity, and where should that name go in our diagram?
-- Q: Which did I omit entirely?
+- Q: Which parablock storage structure did I leave out the name of for simplicity, and where should that name go in our diagram?
+- Q: Which parablock storage structure did I omit entirely?
---
@@ -278,7 +273,7 @@ Notes:
- What is exotic core scheduling?
- Multiple cores per parachain
- - Overlapping leases of many lengths
+ - Overlapping bulk coretime leases of varying lengths
- Lease + On-demand
- How does asynchronous backing help?
- The unincluded segment is necessary to build 2 or more parablocks in a single relay block
diff --git a/syllabus/7-Polkadot/assets/shallow-dive-asynchronous-backing/async_backing_1.svg b/syllabus/7-Polkadot/assets/shallow-dive-asynchronous-backing/async_backing_1.svg
index cd9e1b4a4..2403cd910 100644
--- a/syllabus/7-Polkadot/assets/shallow-dive-asynchronous-backing/async_backing_1.svg
+++ b/syllabus/7-Polkadot/assets/shallow-dive-asynchronous-backing/async_backing_1.svg
@@ -1 +1 @@
-