Core Index Mismatch in Commitments and Descriptor #7107

Open
sw10pa opened this issue Jan 9, 2025 · 0 comments · May be fixed by #7104
Labels
I2-bug The node fails to follow expected behavior. I10-unconfirmed Issue might be valid, but it's not yet known.

sw10pa commented Jan 9, 2025

Description

A bug was identified while implementing the malus collator on top of the undying collator (PR #6924). The collation-generation subsystem logged the following error when the normal (non-malus) undying collator attempted to generate and submit collations to 3 assigned cores:

ERROR tokio-runtime-worker parachain::collation-generation: Failed to construct and distribute collation: V2 core index check failed: The core index in commitments doesn't match the one in descriptor.

The issue arises because the current code fills the descriptor with core indexes taken sequentially from the claim queue, while the commitments can carry core indexes that the parachain chooses through a core selector in its UMP signals, potentially in a different order. When the two disagree, the V2 core index check fails with the error above.
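
The mismatch can be illustrated with a minimal sketch. Everything here is hypothetical and is not the polkadot-sdk API: `CoreIndex`, `descriptor_core`, `committed_core`, and `core_index_matches` are stand-ins, and the sketch assumes that a core selector picks `assigned[selector % assigned.len()]` from the cores assigned to the parachain.

```rust
/// Hypothetical stand-in for a core index; not the polkadot-sdk type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CoreIndex(pub u32);

/// Descriptor side (current behaviour as described in this issue):
/// the n-th collation simply takes the n-th core assigned in the claim queue.
pub fn descriptor_core(assigned: &[CoreIndex], collation_idx: usize) -> Option<CoreIndex> {
    assigned.get(collation_idx).copied()
}

/// Commitments side: the parachain picks a core through a selector carried
/// in its UMP signals. Assumption: the selector wraps around the assigned cores.
pub fn committed_core(assigned: &[CoreIndex], core_selector: u8) -> Option<CoreIndex> {
    if assigned.is_empty() {
        return None;
    }
    Some(assigned[core_selector as usize % assigned.len()])
}

/// Simplified version of the check that fails in the reported error:
/// both sides must resolve to the same core.
pub fn core_index_matches(
    assigned: &[CoreIndex],
    collation_idx: usize,
    core_selector: u8,
) -> bool {
    match (
        descriptor_core(assigned, collation_idx),
        committed_core(assigned, core_selector),
    ) {
        (Some(d), Some(c)) => d == c,
        _ => false,
    }
}
```

With three assigned cores, the first collation gets the first core in the descriptor; if the parachain's selector points at the second core, the two sides disagree and the check fails.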

Steps to Reproduce

  • Set up a network with an undying collator;
  • Ensure that the parachain is assigned to 3 or more cores;
  • Configure the core indexes to be provided by UMP signals.

Why was this not detected earlier?

  • Test collators (e.g., adder and undying) were not using UMP signals, as they are optional;
  • Elastic scaling collators (e.g., slot-based collators) do not have this problem, as it was addressed in PR #5372;
  • UMP signals are new, and collators using collator_fn have not been tested much, as they should not be used in production.

Proposed Solution

  • Modify the logic for constructing the CandidateDescriptorV2 to take the core index from the commitments (via UMP signals) instead of using sequential indexes obtained from the claim queue;
  • Ensure backward compatibility for parachains that do not use UMP signals, allowing the system to function as before in those cases;
  • Add a check to stop processing if the parachain selects the same core multiple times when multiple cores are assigned.
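
The three points above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the fix in the linked PR: `resolve_cores` and `CoreSelectionError` are hypothetical names, cores are plain `u32` values, and a selector is again assumed to pick `assigned[selector % assigned.len()]`.

```rust
use std::collections::HashSet;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum CoreSelectionError {
    /// The parachain has no cores assigned.
    NoAssignedCores,
    /// More collations than assigned cores in the sequential fallback.
    TooManyCollations,
    /// The parachain selected the same core more than once.
    DuplicateCore(u32),
}

/// Resolve the core to put in each collation's descriptor.
///
/// `selectors[i]` is `Some(selector)` if collation `i` carried a core selector
/// in its UMP signals, and `None` if the parachain does not use UMP signals.
pub fn resolve_cores(
    assigned: &[u32],
    selectors: &[Option<u8>],
) -> Result<Vec<u32>, CoreSelectionError> {
    if assigned.is_empty() {
        return Err(CoreSelectionError::NoAssignedCores);
    }
    let mut seen = HashSet::new();
    let mut cores = Vec::with_capacity(selectors.len());
    for (i, selector) in selectors.iter().enumerate() {
        let core = match selector {
            // Core chosen by the parachain through its commitments.
            Some(s) => assigned[*s as usize % assigned.len()],
            // Backward-compatible path: sequential index from the claim queue.
            None => *assigned
                .get(i)
                .ok_or(CoreSelectionError::TooManyCollations)?,
        };
        // Stop processing if the same core is selected twice.
        if !seen.insert(core) {
            return Err(CoreSelectionError::DuplicateCore(core));
        }
        cores.push(core);
    }
    Ok(cores)
}
```

For example, with assigned cores 0, 1, 2 and selectors 2, 0, 1, the descriptors would use cores 2, 0, 1 in that order; without any selectors the result falls back to 0, 1, 2 as before.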
@sw10pa sw10pa added I2-bug The node fails to follow expected behavior. I10-unconfirmed Issue might be valid, but it's not yet known. labels Jan 9, 2025
@sw10pa sw10pa self-assigned this Jan 9, 2025
Projects
Status: Backlog
1 participant