Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(composer)!: update to work with appside mempool #1643

Merged
merged 21 commits into from
Oct 14, 2024

Conversation

Lilyjjo
Copy link
Contributor

@Lilyjjo Lilyjjo commented Oct 8, 2024

Summary

Updates composer to work with the new appside mempool. New ABCI error codes were added and now nonces can be stacked, so the submission logic should use the GRPC pending_nonce() instead of the RPC get_latest_nonce().

Background

New ABCI error codes were added in PR #1515 which need to be handled in the composer's submission logic.

New Error codes:

  • NONCE_TAKEN -> try resubmitting with new top-of-pending. If this is happening because the composer is out of funds, that is okay as the composer has it's own fund monitoring logic
  • PARKED_FULL -> normal error case ok
  • ACCOUNT_SIZE_LIMIT -> normal error case ok
  • ALREADY_PRESENT -> normal error case ok, shouldn't be returned

Background

We changed the internals of the mempool which resulted in the users of the mempool to have changes as well.

Testing

Smoke tests

Related Issues

closes #1633

@github-actions github-actions bot added the composer pertaining to composer label Oct 8, 2024
@Lilyjjo Lilyjjo marked this pull request as ready for review October 8, 2024 12:49
@Lilyjjo Lilyjjo requested a review from a team as a code owner October 8, 2024 12:49
Copy link
Contributor

@itamarreif itamarreif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also add a black box test for this new error. See existing test for INVALID_NONCE:

https://github.com/astriaorg/astria/blob/main/crates/astria-composer/tests/blackbox/geth_collector.rs#L86

Copy link
Member

@SuperFluffy SuperFluffy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The abci error codes are only one part in addressing #1633.

This patch also needs to change how composer fetches the next nonce to use (since right it makes use of cometbft's get_nonce, which is not aware of the mempool).

@Lilyjjo Lilyjjo requested a review from a team as a code owner October 8, 2024 15:49
@github-actions github-actions bot added the cd label Oct 8, 2024
@Lilyjjo
Copy link
Contributor Author

Lilyjjo commented Oct 8, 2024

note: not ready for review yet

@Lilyjjo Lilyjjo changed the title fix(composer): handle new abci error codes fix(composer)!: update to work with appside mempool Oct 8, 2024
@Lilyjjo Lilyjjo force-pushed the lilyjjo/composer_abci_error_fix branch 2 times, most recently from c0b80ae to f6c5a05 Compare October 9, 2024 10:49
@Lilyjjo Lilyjjo self-assigned this Oct 9, 2024
@@ -8,7 +8,8 @@ data:
ASTRIA_COMPOSER_API_LISTEN_ADDR: "0.0.0.0:{{ .Values.ports.healthApi }}"
ASTRIA_COMPOSER_GRPC_ADDR: "0.0.0.0:{{ .Values.ports.grpc }}"
ASTRIA_COMPOSER_SEQUENCER_CHAIN_ID: "{{ tpl .Values.config.sequencerChainId . }}"
ASTRIA_COMPOSER_SEQUENCER_URL: "{{ tpl .Values.config.sequencerRpc . }}"
ASTRIA_COMPOSER_SEQUENCER_HTTP_URL: "{{ tpl .Values.config.sequencerRpc . }}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we call this COMETBFT_ENDPOINT in other places

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated the name in the code as well

})
.with_config(retry_config)
.await
.wrap_err("failed getting latest nonce from sequencer after 1024 attempts");
.wrap_err("failed getting pending nonce from sequencing after 1024 attempts");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"from sequencer" instead of "from sequencing"?

@@ -127,18 +99,57 @@ async fn invalid_nonce_causes_resubmission_under_different_nonce() {

// wait for 1 sequencer block time to make sure the bundle is preempted
tokio::time::timeout(
Duration::from_millis(test_composer.cfg.block_time_ms),
Duration::from_millis(test_composer.cfg.block_time_ms + 1000),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should probably remove the + 1000 here?

}
}

pub(crate) async fn mount_pending_nonce_response(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should probably return a mock guard

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's fine if it doesn't. Ideally you provide the intermediate fetches (maybe with an expect4ed number of requests that match) and only have a mockguard on the actual output of the service (in this case, a transaction that contains a collection of sequence actions).

If you return a mock guard and then wait_until_satisfied it, you end up in a situation where you test the implementation (in which order are requests made?) rather than its behavior (does it produce the expected result?).

let expected_rollup_ids = vec![RollupId::from_unhashed_bytes("test1")];
// Expect nonce 1 again so that the resubmitted tx is accepted
let valid_nonce_guard =
mount_broadcast_tx_sync_mock(&test_composer.sequencer, expected_rollup_ids, vec![1]).await;

// Mount a response of 1 to a nonce query
test_composer
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should get a mock guard from the mount helper and then wait for it to be satisfied

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noted elsewhere: ideally we don't worry about the choreography of which RPCs are made in which order. It's best to just provide them and then try and observe what comes out of the service. Makes the tests less fragile.

Copy link
Contributor

@itamarreif itamarreif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the pending nonce mock should use a mockguard and wait_until_satisfied as the mount is used several times in the test (e.g. for initializing nonce on startup and then on refetch after the invalid/taken nonce errors). besides that, just a couple nits on the error message and config variable names.

@Lilyjjo Lilyjjo force-pushed the lilyjjo/composer_abci_error_fix branch from bd37417 to f0e7f34 Compare October 9, 2024 13:37
@Lilyjjo Lilyjjo force-pushed the lilyjjo/composer_abci_error_fix branch from f0e7f34 to cfbf479 Compare October 9, 2024 13:39
Copy link
Member

@SuperFluffy SuperFluffy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to have a few (user facing) names changed, but otherwise this looks great.

}
}

pub(crate) async fn mount_pending_nonce_response(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's fine if it doesn't. Ideally you provide the intermediate fetches (maybe with an expect4ed number of requests that match) and only have a mockguard on the actual output of the service (in this case, a transaction that contains a collection of sequence actions).

If you return a mock guard and then wait_until_satisfied it, you end up in a situation where you test the implementation (in which order are requests made?) rather than its behavior (does it produce the expected result?).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@itamarreif Can you please go over these and provide feedback if anything that was tested here is not in some way covered in the blackbox tests?

I agree with removing the tests here (it should have been done a long time ago), rather than continuing to fix them.

@@ -25,7 +25,10 @@ NO_COLOR=
ASTRIA_COMPOSER_API_LISTEN_ADDR="0.0.0.0:0"

# Address of the RPC server for the sequencer chain
ASTRIA_COMPOSER_SEQUENCER_URL="http://127.0.0.1:26657"
ASTRIA_COMPOSER_COMETBFT_ENDPOINT="http://127.0.0.1:26657"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ASTRIA_COMPOSER_COMETBFT_ENDPOINT="http://127.0.0.1:26657"
ASTRIA_COMPOSER_SEQUENCER_ABCI_ENDPOINT="http://127.0.0.1:26657"

@@ -27,7 +27,10 @@ pub struct Config {
pub api_listen_addr: SocketAddr,

/// Address of the RPC server for the sequencer chain
pub sequencer_url: String,
pub cometbft_endpoint: String,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for the comment on the env.example - to be consistent we should call this sequencer_abci_endpoint to distinguish between it and the grpc endpoint.

ASTRIA_COMPOSER_SEQUENCER_URL="http://127.0.0.1:26657"
ASTRIA_COMPOSER_COMETBFT_ENDPOINT="http://127.0.0.1:26657"

# Address of the RPC server for the sequencer chain
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Address of the RPC server for the sequencer chain
# Address of the gRPC server for the sequencer chain

crates/astria-composer/src/executor/mod.rs Show resolved Hide resolved
@@ -517,7 +536,7 @@ async fn get_latest_nonce(
err,
)]
async fn submit_tx(
client: sequencer_client::HttpClient,
cometbft_client: sequencer_client::HttpClient,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nit: I think you could just keep this as client. In the context of this function there is no confusion.

But if you want to keep it, I think I'd lean toward calling this abci_client.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's call this mock_abci_sequencer.rs

Status,
};

const GET_PENDING_NONCE_GRPC_NAME: &str = "get_pending_nonce";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a side note: the full path that is actually called as part of the gRPC request is "/astria.sequencerblock.v1alpha1.SequencerService/GetPendingNonce"

let expected_rollup_ids = vec![RollupId::from_unhashed_bytes("test1")];
// Expect nonce 1 again so that the resubmitted tx is accepted
let valid_nonce_guard =
mount_broadcast_tx_sync_mock(&test_composer.sequencer, expected_rollup_ids, vec![1]).await;

// Mount a response of 1 to a nonce query
test_composer
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noted elsewhere: ideally we don't worry about the choreography of which RPCs are made in which order. It's best to just provide them and then try and observe what comes out of the service. Makes the tests less fragile.

Copy link
Contributor

@itamarreif itamarreif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Went over the deleted tests and made an issue for adding the coverage back as black box tests: #1652

/// Test to check that the executor sends a signed transaction to the sequencer as soon as it
/// receives a `SequenceAction` that fills it beyond its `max_bundle_size`.
#[tokio::test]
async fn full_bundle() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test does not have an analogous black box test currently. However, there are unit tests on the BundleFactory that test for full bundles being fired off.

/// Test to check that the executor sends a signed transaction to the sequencer after its
/// `block_timer` has ticked
#[tokio::test]
async fn bundle_triggered_by_block_timer() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test also doesn't have an analog in the black box tests. We've tested this internally to the crate, in the executor module instead as this test assumes quite a bit of knowledge about the internal functions of the composer (specifically that if the bundle isn't filled then it will be submitted after the block timer).

/// Test to check that executor's chain ID check is properly checked against the sequencer's chain
/// ID
#[tokio::test]
async fn chain_id_mismatch_returns_error() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test does not currently exist in the blackbox tests as it requires bubbling errors up out from the composer service.

/// Test to check that the executor sends a signed transaction with two sequence actions to the
/// sequencer.
#[tokio::test]
async fn two_seq_actions_single_bundle() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test also does not have an analog blackbox test currently. A test similar to the existing single_rollup_tx_payload_integrity that has two transactions should be added to cover this.

Copy link
Member

@SuperFluffy SuperFluffy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix!

The tests in the executor module are in a weird state and should be part of the blackbox tests instead. As they are testing relatively low impact functionality as per @itamarreif's notes, I am in favor of removing them here and following up in a future patch.

@@ -30,6 +29,8 @@ data:
OTEL_EXPORTER_OTLP_TRACE_HEADERS: "{{ tpl .Values.otel.traceHeaders . }}"
OTEL_SERVICE_NAME: "{{ tpl .Values.otel.serviceName . }}"
{{- if not .Values.global.dev }}
ASTRIA_COMPOSER_SEQUENCER_ABCI_ENDPOINT: "{{ tpl .Values.config.sequencerRpc . }}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when .Values.global.dev sets to true it marks a breaking change to configs. Therefore, new env variables should be added to else bracket while values being removed should be added to {{- if not .Values.global.dev }} bracket.

@Lilyjjo Lilyjjo force-pushed the lilyjjo/composer_abci_error_fix branch from baac1c3 to 5bdfddc Compare October 14, 2024 14:10
Copy link
Contributor

@quasystaty1 quasystaty1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Charts LGTM, small nit

charts/composer/templates/configmap.yaml Show resolved Hide resolved
Lilyjjo and others added 2 commits October 14, 2024 16:25
Co-authored-by: quasystaty <ido@astria.org>
@Lilyjjo Lilyjjo added this pull request to the merge queue Oct 14, 2024
Merged via the queue into main with commit acfd370 Oct 14, 2024
44 checks passed
@Lilyjjo Lilyjjo deleted the lilyjjo/composer_abci_error_fix branch October 14, 2024 14:57
steezeburger added a commit that referenced this pull request Oct 16, 2024
* main:
  feat(sequencer)!: rework all fees (#1647)
  fix(sequencer): fix app hash in horcrux sentries (#1646)
  fix(composer)!: update to work with appside mempool (#1643)
  fix(charts): rollup application monitoring (#1601)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cd composer pertaining to composer
Projects
None yet
Development

Successfully merging this pull request may close these issues.

composer: handle new mempool ABCI error codes on transaction submission
4 participants