-
Notifications
You must be signed in to change notification settings - Fork 75
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore(sequencer-relayer)!: minimize resubmissions to Celestia #1234
Conversation
…estia-resubmissions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great and I don't see any outright blockers, just a bunch of requests for stylistic changes and tightening naming (to make the use of types and states clearer on a first read).
A big request I have is to not omit the use of context in error handling. While an author might have reason to avoid the context (for example, because ample information is provided in the called function), it's not at all clear for a reviewer. :)
crates/astria-sequencer-relayer/src/relayer/celestia_client/mod.rs
Outdated
Show resolved
Hide resolved
crates/astria-sequencer-relayer/src/relayer/celestia_client/mod.rs
Outdated
Show resolved
Hide resolved
crates/astria-sequencer-relayer/src/relayer/celestia_client/mod.rs
Outdated
Show resolved
Hide resolved
crates/astria-sequencer-relayer/src/relayer/celestia_client/mod.rs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, tested locally
## Summary The relayer has been updated to avoid resubmitting the same data to Celestia on restart or on a timeout of the `BroadcastTx` gRPC. ## Background Currently the relayer will likely resubmit the same data when it restarts, since the majority of the time spent in the submit loop is waiting for confirmation of the `BlobTx` having been stored. While in this state, the on-disk files aren't updated to indicate that we're waiting for confirmation. Hence if we restart, the new session begins by submitting all data from the last-confirmed point, i.e. the same data which was already likely successfully submitted in the final attempt of the previous session. A similar situation happens if we timeout while waiting for the `BroadcastTx` response - the retry loop will attempt to resubmit the same data without first checking if the previous attempt succeeded. ## Changes * Replaced the two state files (presubmit and postsubmit) with a single one (submission-state) and similarly for their respective env vars. See [the updated spec](https://github.com/astriaorg/astria/blob/c110bdbf7d0fd05fba04775d07d046c37bbd7372/specs/sequencer-relayer.md#submission-state-file) for further details. * The `submission` module was heavily changed: there are now two primary state structs `StartedSubmission` and `PreparedSubmission`, between which the blob submitter toggles during normal operation. There is also `FreshSubmission` which is only relevant at startup, and an enum covering these three (`SubmissionStateAtStartup`) which is also only used during startup. Finally, the `State` enum is an ephemeral object only used to read/write the relevant state from/to disk. * `BlobSubmitter::run` was modified to try to confirm the last submission attempt from the previous session if the state file indicated the relayer exited while in `prepared` state. If that submission is confirmed (the most likely outcome), then the sequencer blocks in that submission are simply skipped in the write loop. (We could try to avoid even fetching these sequencer blocks, but that would be a significantly more pervasive change, and is probably not worth the complexity). * The relayer's celestia client was changed to split `try_submit` into `try_prepare` and `try_submit` so that the hash of the prepared `BlobTx` can be returned from `try_prepare` to be recorded in the state file before the transaction is broadcast to the Celestia app. * `submit_with_retry` was updated to check for a broadcast timeout error in the previous attempt, and in that case, attempts to just confirm that submission rather than automatically resubmitting the same data. ## Testing * Unit tests for the new `submission` types. * Black box test `later_height_in_state_leads_to_expected_relay` was updated. * Manually observed expected behaviour against a locally-running sequencer and Celestia app. ## Breaking Changelist * Removed `ASTRIA_SEQUENCER_RELAYER_PRE_SUBMIT_PATH` and `ASTRIA_SEQUENCER_RELAYER_POST_SUBMIT_PATH` config env vars. * Added `ASTRIA_SEQUENCER_RELAYER_SUBMISSION_STATE_PATH` config env var. ## Related Issues Closes #1200.
* main: release: cut bridge withdrawer release (#1303) release: version cuts for dusk-9 (#1299) chore(core): remove ed25519_consensus from public API (#1277) chore: remove spurious entry in gitignore (#1276) chore(chart): Update EVM-Rollup Geth devTag (#1300) fix(proto)!: Change execution API to use primitive RollupId (#1291) refactor(core, proto)!: define bridge memos in proto (#1285) chore(sequencer-relayer)!: minimize resubmissions to Celestia (#1234)
Summary
The relayer has been updated to avoid resubmitting the same data to Celestia on restart or on a timeout of the
BroadcastTx
gRPC.Background
Currently the relayer will likely resubmit the same data when it restarts, since the majority of the time spent in the submit loop is waiting for confirmation of the
BlobTx
having been stored. While in this state, the on-disk files aren't updated to indicate that we're waiting for confirmation. Hence if we restart, the new session begins by submitting all data from the last-confirmed point, i.e. the same data which was already likely successfully submitted in the final attempt of the previous session.A similar situation happens if we timeout while waiting for the
BroadcastTx
response - the retry loop will attempt to resubmit the same data without first checking if the previous attempt succeeded.Changes
submission
module was heavily changed: there are now two primary state structsStartedSubmission
andPreparedSubmission
, between which the blob submitter toggles during normal operation. There is alsoFreshSubmission
which is only relevant at startup, and an enum covering these three (SubmissionStateAtStartup
) which is also only used during startup. Finally, theState
enum is an ephemeral object only used to read/write the relevant state from/to disk.BlobSubmitter::run
was modified to try to confirm the last submission attempt from the previous session if the state file indicated the relayer exited while inprepared
state. If that submission is confirmed (the most likely outcome), then the sequencer blocks in that submission are simply skipped in the write loop. (We could try to avoid even fetching these sequencer blocks, but that would be a significantly more pervasive change, and is probably not worth the complexity).try_submit
intotry_prepare
andtry_submit
so that the hash of the preparedBlobTx
can be returned fromtry_prepare
to be recorded in the state file before the transaction is broadcast to the Celestia app.submit_with_retry
was updated to check for a broadcast timeout error in the previous attempt, and in that case, attempts to just confirm that submission rather than automatically resubmitting the same data.Testing
submission
types.later_height_in_state_leads_to_expected_relay
was updated.Breaking Changelist
ASTRIA_SEQUENCER_RELAYER_PRE_SUBMIT_PATH
andASTRIA_SEQUENCER_RELAYER_POST_SUBMIT_PATH
config env vars.ASTRIA_SEQUENCER_RELAYER_SUBMISSION_STATE_PATH
config env var.Related Issues
Closes #1200.