chore(sequencer-relayer)!: minimize resubmissions to Celestia #1234

Fraser999 · 2024-07-02T22:22:08Z

Summary

The relayer has been updated to avoid resubmitting the same data to Celestia on restart or on a timeout of the BroadcastTx gRPC.

Background

Currently the relayer will likely resubmit the same data when it restarts, since the majority of the time spent in the submit loop is waiting for confirmation of the BlobTx having been stored. While in this state, the on-disk files aren't updated to indicate that we're waiting for confirmation. Hence if we restart, the new session begins by submitting all data from the last-confirmed point, i.e. the same data which was already likely successfully submitted in the final attempt of the previous session.

A similar situation happens if we timeout while waiting for the BroadcastTx response - the retry loop will attempt to resubmit the same data without first checking if the previous attempt succeeded.

Changes

Replaced the two state files (presubmit and postsubmit) with a single one (submission-state) and similarly for their respective env vars. See the updated spec for further details.
The submission module was heavily changed: there are now two primary state structs StartedSubmission and PreparedSubmission, between which the blob submitter toggles during normal operation. There is also FreshSubmission which is only relevant at startup, and an enum covering these three (SubmissionStateAtStartup) which is also only used during startup. Finally, the State enum is an ephemeral object only used to read/write the relevant state from/to disk.
BlobSubmitter::run was modified to try to confirm the last submission attempt from the previous session if the state file indicated the relayer exited while in prepared state. If that submission is confirmed (the most likely outcome), then the sequencer blocks in that submission are simply skipped in the write loop. (We could try to avoid even fetching these sequencer blocks, but that would be a significantly more pervasive change, and is probably not worth the complexity).
The relayer's celestia client was changed to split try_submit into try_prepare and try_submit so that the hash of the prepared BlobTx can be returned from try_prepare to be recorded in the state file before the transaction is broadcast to the Celestia app.
submit_with_retry was updated to check for a broadcast timeout error in the previous attempt, and in that case, attempts to just confirm that submission rather than automatically resubmitting the same data.

Testing

Unit tests for the new submission types.
Black box test later_height_in_state_leads_to_expected_relay was updated.
Manually observed expected behaviour against a locally-running sequencer and Celestia app.

Breaking Changelist

Removed ASTRIA_SEQUENCER_RELAYER_PRE_SUBMIT_PATH and ASTRIA_SEQUENCER_RELAYER_POST_SUBMIT_PATH config env vars.
Added ASTRIA_SEQUENCER_RELAYER_SUBMISSION_STATE_PATH config env var.

Related Issues

Closes #1200.

…estia-resubmissions

SuperFluffy

This looks great and I don't see any outright blockers, just a bunch of requests for stylistic changes and tightening naming (to make the use of types and states clearer on a first read).

A big request I have is to not omit the use of context in error handling. While an author might have reason to avoid the context (for example, because ample information is provided in the called function), it's not at all clear for a reviewer. :)

charts/sequencer-relayer/files/scripts/start-relayer.sh

crates/astria-sequencer-relayer/src/relayer/celestia_client/mod.rs

crates/astria-sequencer-relayer/src/relayer/write/mod.rs

crates/astria-sequencer-relayer/src/relayer/celestia_client/mod.rs

…estia-resubmissions

quasystaty1

LGTM, tested locally

charts/sequencer-relayer/templates/_helpers.tpl

## Summary The relayer has been updated to avoid resubmitting the same data to Celestia on restart or on a timeout of the `BroadcastTx` gRPC. ## Background Currently the relayer will likely resubmit the same data when it restarts, since the majority of the time spent in the submit loop is waiting for confirmation of the `BlobTx` having been stored. While in this state, the on-disk files aren't updated to indicate that we're waiting for confirmation. Hence if we restart, the new session begins by submitting all data from the last-confirmed point, i.e. the same data which was already likely successfully submitted in the final attempt of the previous session. A similar situation happens if we timeout while waiting for the `BroadcastTx` response - the retry loop will attempt to resubmit the same data without first checking if the previous attempt succeeded. ## Changes * Replaced the two state files (presubmit and postsubmit) with a single one (submission-state) and similarly for their respective env vars. See [the updated spec](https://github.com/astriaorg/astria/blob/c110bdbf7d0fd05fba04775d07d046c37bbd7372/specs/sequencer-relayer.md#submission-state-file) for further details. * The `submission` module was heavily changed: there are now two primary state structs `StartedSubmission` and `PreparedSubmission`, between which the blob submitter toggles during normal operation. There is also `FreshSubmission` which is only relevant at startup, and an enum covering these three (`SubmissionStateAtStartup`) which is also only used during startup. Finally, the `State` enum is an ephemeral object only used to read/write the relevant state from/to disk. * `BlobSubmitter::run` was modified to try to confirm the last submission attempt from the previous session if the state file indicated the relayer exited while in `prepared` state. If that submission is confirmed (the most likely outcome), then the sequencer blocks in that submission are simply skipped in the write loop. (We could try to avoid even fetching these sequencer blocks, but that would be a significantly more pervasive change, and is probably not worth the complexity). * The relayer's celestia client was changed to split `try_submit` into `try_prepare` and `try_submit` so that the hash of the prepared `BlobTx` can be returned from `try_prepare` to be recorded in the state file before the transaction is broadcast to the Celestia app. * `submit_with_retry` was updated to check for a broadcast timeout error in the previous attempt, and in that case, attempts to just confirm that submission rather than automatically resubmitting the same data. ## Testing * Unit tests for the new `submission` types. * Black box test `later_height_in_state_leads_to_expected_relay` was updated. * Manually observed expected behaviour against a locally-running sequencer and Celestia app. ## Breaking Changelist * Removed `ASTRIA_SEQUENCER_RELAYER_PRE_SUBMIT_PATH` and `ASTRIA_SEQUENCER_RELAYER_POST_SUBMIT_PATH` config env vars. * Added `ASTRIA_SEQUENCER_RELAYER_SUBMISSION_STATE_PATH` config env var. ## Related Issues Closes #1200.

* main: release: cut bridge withdrawer release (#1303) release: version cuts for dusk-9 (#1299) chore(core): remove ed25519_consensus from public API (#1277) chore: remove spurious entry in gitignore (#1276) chore(chart): Update EVM-Rollup Geth devTag (#1300) fix(proto)!: Change execution API to use primitive RollupId (#1291) refactor(core, proto)!: define bridge memos in proto (#1285) chore(sequencer-relayer)!: minimize resubmissions to Celestia (#1234)

minimize resubmission to celestia by relayer

9d9b340

Fraser999 requested review from a team as code owners July 2, 2024 22:22

Fraser999 requested review from noot and joroshiba July 2, 2024 22:22

github-actions bot added documentation Improvements or additions to documentation sequencer-relayer pertaining to the astria-sequencer-relayer crate cd labels Jul 2, 2024

Merge remote-tracking branch 'upstream/main' into fraser/minimize-cel…

a322921

…estia-resubmissions

SuperFluffy approved these changes Jul 23, 2024

View reviewed changes

Fraser999 added 2 commits July 23, 2024 19:58

several stylistic improvements to submission state code

9ade24c

Merge remote-tracking branch 'upstream/main' into fraser/minimize-cel…

183af9f

…estia-resubmissions

quasystaty1 approved these changes Jul 24, 2024

View reviewed changes

joroshiba reviewed Jul 24, 2024

View reviewed changes

charts/sequencer-relayer/templates/_helpers.tpl Show resolved Hide resolved

add comments

01643d9

Fraser999 added this pull request to the merge queue Jul 25, 2024

Merged via the queue into main with commit 961294c Jul 25, 2024
42 checks passed

Fraser999 deleted the fraser/minimize-celestia-resubmissions branch July 25, 2024 09:30

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

chore(sequencer-relayer)!: minimize resubmissions to Celestia #1234

chore(sequencer-relayer)!: minimize resubmissions to Celestia #1234

Fraser999 commented Jul 2, 2024 •

edited by sync-by-unito bot

Loading

SuperFluffy left a comment

quasystaty1 left a comment

chore(sequencer-relayer)!: minimize resubmissions to Celestia #1234

chore(sequencer-relayer)!: minimize resubmissions to Celestia #1234

Conversation

Fraser999 commented Jul 2, 2024 • edited by sync-by-unito bot Loading

Summary

Background

Changes

Testing

Breaking Changelist

Related Issues

SuperFluffy left a comment

Choose a reason for hiding this comment

quasystaty1 left a comment

Choose a reason for hiding this comment

Fraser999 commented Jul 2, 2024 •

edited by sync-by-unito bot

Loading