Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(starknet_l1_provider): add SoftDeleteIndexMap #3208

Merged
merged 1 commit into from
Jan 15, 2025

Conversation

giladchase
Copy link
Contributor

Previous implementation inside transaction manager didn't account for
validate, the current implementation is more general, and can in the
past perhaps be reused elsewhere after some changes are made.

@reviewable-StarkWare
Copy link

This change is Reviewable

Copy link
Contributor

@matan-starkware matan-starkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 6 of 6 files at r1, all commit messages.
Reviewable status: all files reviewed, 11 unresolved discussions (waiting on @giladchase)


-- commits line 6 at r1:
?

Suggestion:

   future perhaps be reused elsewhere after some changes are made.

crates/starknet_l1_provider/src/test_utils.rs line 86 at r1 (raw file):

impl TransactionManagerContent {
    fn complete_to_tx_manager(mut self) -> TransactionManager {

TryInto?

Code quote:

    fn complete_to_tx_manager(mut self) -> TransactionManager {

crates/starknet_l1_provider/src/soft_delete_index_map.rs line 15 at r1 (raw file):

#[derive(Clone, Debug, Default)]
pub struct SoftDeleteIndexMap {
    txs: IndexMap<TransactionHash, TransactionEntry>,

In principle what you are trying to achieve is a priority queue? It just so happens that our current priority is first come first serve which IndexMap gives us?

Is there any plan to move into a different form of selection?

Code quote:

    txs: IndexMap<TransactionHash, TransactionEntry>,

crates/starknet_l1_provider/src/soft_delete_index_map.rs line 21 at r1 (raw file):

impl SoftDeleteIndexMap {
    /// Inserts a transaction into the map, returning the previous transaction if it existed.
    pub fn _insert(&mut self, tx: L1HandlerTransaction) -> Option<L1HandlerTransaction> {

Suggestion:

    pub fn insert(&mut self, tx: L1HandlerTransaction) -> Option<L1HandlerTransaction> {

crates/starknet_l1_provider/src/soft_delete_index_map.rs line 23 at r1 (raw file):

    pub fn _insert(&mut self, tx: L1HandlerTransaction) -> Option<L1HandlerTransaction> {
        match self.txs.entry(tx.tx_hash) {
            Entry::Occupied(entry) => Some(entry.get().transaction.clone()),

Why don't we just call self.txs.insert?

IIUC this like does not actually insert the new tx into the map. Rather we assume that the tx_hash indicates the contents are the same and we we clone the contents of the old entry.

We intentionally ignore the TxState of the existing entry?

Code quote:

            Entry::Occupied(entry) => Some(entry.get().transaction.clone()),

crates/starknet_l1_provider/src/soft_delete_index_map.rs line 32 at r1 (raw file):

    /// Soft delete and return a reference to the first unstaged transaction, by insertion order.
    pub fn soft_pop_back(&mut self) -> Option<&L1HandlerTransaction> {

Impl looks to me like we remove the first available element, not the last

Suggestion:

    pub fn soft_pop_front(&mut self) -> Option<&L1HandlerTransaction> {

crates/starknet_l1_provider/src/soft_delete_index_map.rs line 65 at r1 (raw file):

    /// BTreeMap<u32, TransactionEntry>, Hashmap<TransactionHash, u32> and a counter: u32, such that
    /// every new tx is inserted to the map with key counter++ and the counter is not reduced
    /// when removing entries. Once the counter reaches u32::MAX/2 we recreate the DS in Theta(n).

This should not be in the public doc, switch to //

So my understanding:

  • N = tx_hashes.len()
  • M = self.txs.len()
  1. Build tx_hash set - O(N)
  2. Partition txs - O(M) since each contains check is O(1)
  3. rebuild txs - O(M-N)
  4. build committed - O(N)

Let's simplify this and say "Operation is linear time with both the number of known transactions and the number of committed transactions"

Code quote:

    /// Performance note: this method runs in Theta(n), since removing elements from indexmap
    /// requires shifting elements in O(n). Since we are removing multiple elements, recreating
    /// the indexmap is faster than removing each element individually.
    /// This is assumed to be good enough while l1-handler numbers remain low, but if this changes
    /// and we need log(n) removals (amortized), replace indexmap with this (basically a
    /// BTreeIndexMap):
    /// BTreeMap<u32, TransactionEntry>, Hashmap<TransactionHash, u32> and a counter: u32, such that
    /// every new tx is inserted to the map with key counter++ and the counter is not reduced
    /// when removing entries. Once the counter reaches u32::MAX/2 we recreate the DS in Theta(n).

crates/starknet_l1_provider/src/soft_delete_index_map.rs line 82 at r1 (raw file):

    /// Rolls back all staged transactions, converting them to unstaged.
    pub fn _rollback(&mut self) {

Why are pub functions underscored? That's a python thing I thought.

I think of "rollback" being for a commit, not staging. I prefer either unstage or reset_staging, or rollback_staging.

Suggestion:

    pub fn unstage(&mut self) {

crates/starknet_l1_provider/src/soft_delete_index_map.rs line 98 at r1 (raw file):

        SoftDeleteIndexMap { txs, ..Default::default() }
    }
}

We don't also want a new() fn?

Code quote:

impl From<Vec<L1HandlerTransaction>> for SoftDeleteIndexMap {
    fn from(txs: Vec<L1HandlerTransaction>) -> Self {
        let txs = txs.into_iter().map(|tx| (tx.tx_hash, TransactionEntry::new(tx))).collect();
        SoftDeleteIndexMap { txs, ..Default::default() }
    }
}

crates/starknet_l1_provider/src/soft_delete_index_map.rs line 124 at r1 (raw file):

            self.state = TxState::Staged;
        }
    }

What about removing this and unstage and having a function set_state(TxState)?

Suggestion:

    pub fn stage(&mut self) {
        self.state = TxState::Staged;
    }

crates/starknet_l1_provider/src/soft_delete_index_map.rs line 127 at r1 (raw file):

    pub fn is_available(&self) -> bool {
        matches!(self.state, TxState::Unstaged)

Not blocking, I just like to avoid macros when easy to.

Suggestion:

        match self.state {
            TxState::Unstaged => true,
            TxState::Staged => false,
        }

@giladchase giladchase force-pushed the gilad/soft-delete-index-map branch 2 times, most recently from d59f76d to 4f516ec Compare January 9, 2025 14:05
Copy link
Contributor Author

@giladchase giladchase left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 3 of 6 files reviewed, 9 unresolved discussions (waiting on @matan-starkware)


-- commits line 6 at r1:

Previously, matan-starkware wrote…

?

Good lord 🙈 tnx


crates/starknet_l1_provider/src/soft_delete_index_map.rs line 15 at r1 (raw file):

In principle what you are trying to achieve is a priority queue? It just so happens that our current priority is first come first serve which IndexMap gives us?

Ya for now first come first serve, don't know of any current plans of this changing any time soon.


crates/starknet_l1_provider/src/soft_delete_index_map.rs line 23 at r1 (raw file):

IIUC this like does not actually insert the new tx into the map. Rather we assume that the tx_hash indicates the contents are the same and we we clone the contents of the old entry.

Yes I was assuming that tx_hash uniquely describes the L1TransactionHandler inside, since it is computed by having it's fields. In other words, unless someone adds a bug somewhere, the given tx should be identical to the stored one.

Added an assert to verify this.

We intentionally ignore the TxState of the existing entry?

Ya it shouldn't matter for insertion. In fact, this is one of those cases that shouldn't happen anyway, since our scraper never backtracks and there are no dups on L1, but better be safe and check it anyway.


crates/starknet_l1_provider/src/soft_delete_index_map.rs line 32 at r1 (raw file):

Previously, matan-starkware wrote…

Impl looks to me like we remove the first available element, not the last

Arrrrg i also get those mixed up, thanks!


crates/starknet_l1_provider/src/soft_delete_index_map.rs line 65 at r1 (raw file):

This should not be in the public doc, switch to //

Right, fixed.

Let's simplify this and say "Operation is linear time with both the number of known transactions and the number of committed transactions"

Done (if I understand correctly which part you wanted changed?)


crates/starknet_l1_provider/src/soft_delete_index_map.rs line 82 at r1 (raw file):

Why are pub functions underscored? That's a python thing I thought.

Not related to the python privacy thing, even if it's pub it's still subject to the unused lint because the module is pub(crate) and not pub.
The _is to quiet it down without haveing to add a ignore lint thing

I think of "rollback" being for a commit, not staging. I prefer either unstage or reset_staging, or rollback_staging.

Changed to rollback_staging. WDYM by rollback being for a commit? I picked rollback since this staging behavior seems almost exactly like db transactions, where rollback behaves like reset_staging, as the opposite of commit, in which all operations in the session are undo'ed (only if commit has been called previously)


crates/starknet_l1_provider/src/soft_delete_index_map.rs line 98 at r1 (raw file):

Previously, matan-starkware wrote…

We don't also want a new() fn?

Sure, added (also with _ because not used yet).


crates/starknet_l1_provider/src/soft_delete_index_map.rs line 124 at r1 (raw file):

Previously, matan-starkware wrote…

What about removing this and unstage and having a function set_state(TxState)?

Done


crates/starknet_l1_provider/src/soft_delete_index_map.rs line 127 at r1 (raw file):

Previously, matan-starkware wrote…

Not blocking, I just like to avoid macros when easy to.

death to macros


crates/starknet_l1_provider/src/test_utils.rs line 86 at r1 (raw file):

Previously, matan-starkware wrote…

TryInto?

Done (made it an Into since it can't fail)


crates/starknet_l1_provider/src/soft_delete_index_map.rs line 21 at r1 (raw file):

impl SoftDeleteIndexMap {
    /// Inserts a transaction into the map, returning the previous transaction if it existed.
    pub fn _insert(&mut self, tx: L1HandlerTransaction) -> Option<L1HandlerTransaction> {

Not used in this PR yet, will be used in subsequent commits I have in store, wanted to shave things off of this PR as much as possible.

I could have added a ignore(unused) in the file, but those tend to get forgotten, and the _ prefix is impossible to miss.
I can remove and add the ignore if you prefer though.

@giladchase giladchase force-pushed the gilad/split-files-2 branch from be0cec0 to a4102d8 Compare January 9, 2025 14:14
@giladchase giladchase force-pushed the gilad/soft-delete-index-map branch from 4f516ec to 0c7d6df Compare January 9, 2025 14:29
Base automatically changed from gilad/split-files-2 to main January 12, 2025 15:13
@giladchase giladchase force-pushed the gilad/soft-delete-index-map branch from 0c7d6df to 4d2d043 Compare January 12, 2025 15:14
Copy link
Contributor

@matan-starkware matan-starkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 9 of 9 files at r3, all commit messages.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @giladchase)


crates/starknet_l1_provider/src/soft_delete_index_map.rs line 21 at r1 (raw file):

Previously, giladchase wrote…

Not used in this PR yet, will be used in subsequent commits I have in store, wanted to shave things off of this PR as much as possible.

I could have added a ignore(unused) in the file, but those tend to get forgotten, and the _ prefix is impossible to miss.
I can remove and add the ignore if you prefer though.

Ok I missed that fact, yeah underscore is nicer.

Because this is already under review let's just move forward as is, but WDYT of in the future:

  1. You write the whole thing; full struct + tests + usage (I'm assuming you already did that)
  2. Sit with me to explain it - so I get the full context
  3. Break into PRs to avoid these unused function

crates/starknet_l1_provider/src/soft_delete_index_map.rs line 23 at r1 (raw file):

Previously, giladchase wrote…

IIUC this like does not actually insert the new tx into the map. Rather we assume that the tx_hash indicates the contents are the same and we we clone the contents of the old entry.

Yes I was assuming that tx_hash uniquely describes the L1TransactionHandler inside, since it is computed by having it's fields. In other words, unless someone adds a bug somewhere, the given tx should be identical to the stored one.

Added an assert to verify this.

We intentionally ignore the TxState of the existing entry?

Ya it shouldn't matter for insertion. In fact, this is one of those cases that shouldn't happen anyway, since our scraper never backtracks and there are no dups on L1, but better be safe and check it anyway.

Ok


crates/starknet_l1_provider/src/soft_delete_index_map.rs line 82 at r1 (raw file):

Previously, giladchase wrote…

Why are pub functions underscored? That's a python thing I thought.

Not related to the python privacy thing, even if it's pub it's still subject to the unused lint because the module is pub(crate) and not pub.
The _is to quiet it down without haveing to add a ignore lint thing

I think of "rollback" being for a commit, not staging. I prefer either unstage or reset_staging, or rollback_staging.

Changed to rollback_staging. WDYM by rollback being for a commit? I picked rollback since this staging behavior seems almost exactly like db transactions, where rollback behaves like reset_staging, as the opposite of commit, in which all operations in the session are undo'ed (only if commit has been called previously)

In my mind "rollback a commit" is a phrase so I like mentioning staging explicitly.


crates/starknet_l1_provider/src/soft_delete_index_map.rs line 127 at r1 (raw file):

Previously, giladchase wrote…

death to macros

Hehe

I will admit that I love the derive macro


crates/starknet_l1_provider/src/soft_delete_index_map.rs line 28 at r3 (raw file):

        match self.txs.entry(tx.tx_hash) {
            Entry::Occupied(entry) => {
                debug_assert_eq!(entry.get().transaction, tx);

I'd hold off on making this debug until:

  1. We have spent some time with the code and are confident there is no such bug
  2. We are worried about optimizing this path

Suggestion:

                assert_eq!(entry.get().transaction, tx);

crates/starknet_l1_provider/src/soft_delete_index_map.rs line 29 at r3 (raw file):

            Entry::Occupied(entry) => {
                debug_assert_eq!(entry.get().transaction, tx);
                Some(entry.get().transaction.clone())

avoid clone?

Suggestion:

                Some(tx)

@giladchase giladchase force-pushed the gilad/soft-delete-index-map branch from 4d2d043 to 806a96e Compare January 14, 2025 10:34
Copy link
Contributor Author

@giladchase giladchase left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 4 of 9 files reviewed, 2 unresolved discussions (waiting on @matan-starkware)


crates/starknet_l1_provider/src/soft_delete_index_map.rs line 21 at r1 (raw file):

Previously, matan-starkware wrote…

Ok I missed that fact, yeah underscore is nicer.

Because this is already under review let's just move forward as is, but WDYT of in the future:

  1. You write the whole thing; full struct + tests + usage (I'm assuming you already did that)
  2. Sit with me to explain it - so I get the full context
  3. Break into PRs to avoid these unused function

Sure.

Though in order to completely it'll mean PRs will get somewhat large, because whether I go bottom-up or top-down, you'll always hit unused issues unless the whole flow is added at once.

Another option which might make things simpler is to make everything pub while it's WIP, and after everything's merged we can do a really quick one-pass and add pub(crate)/remove-pub on everything, once all the usages are in.


crates/starknet_l1_provider/src/soft_delete_index_map.rs line 23 at r1 (raw file):

Previously, matan-starkware wrote…

Ok

Done.


crates/starknet_l1_provider/src/soft_delete_index_map.rs line 127 at r1 (raw file):

Previously, matan-starkware wrote…

Hehe

I will admit that I love the derive macro

Note that for matches! we have to fight clippy a bit, cause if I do something like:

match foo_out_of_many {
  Foo => true
  _ => false
}

clippy will tell me to use matches!.
You'll see it in a later PR when i match on ProviderState, I'll have to do:

match state {
   ProviderState::Uninitialized => true
   ProviderState::Pending | ProviderState::Validate | ProviderState::Proposer => false 
}

which is getting a bit noisy

Also ya derive's cool (except for the part about forcing derives on generic variables boooo)


crates/starknet_l1_provider/src/soft_delete_index_map.rs line 29 at r3 (raw file):

Previously, matan-starkware wrote…

avoid clone?

Done (had to do an extra hash query to please the borrow checker)

Copy link
Contributor

@matan-starkware matan-starkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 5 of 5 files at r4, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @giladchase)


crates/starknet_l1_provider/src/soft_delete_index_map.rs line 21 at r1 (raw file):

Previously, giladchase wrote…

Sure.

Though in order to completely it'll mean PRs will get somewhat large, because whether I go bottom-up or top-down, you'll always hit unused issues unless the whole flow is added at once.

Another option which might make things simpler is to make everything pub while it's WIP, and after everything's merged we can do a really quick one-pass and add pub(crate)/remove-pub on everything, once all the usages are in.

Right, we don't want the entire project to be one big PR and block you sending. I was thinking more like this struct could be 1 big PR that you get right before breaking into many PRs for me to review. That scale make sense?


crates/starknet_l1_provider/src/soft_delete_index_map.rs line 127 at r1 (raw file):

Previously, giladchase wrote…

Note that for matches! we have to fight clippy a bit, cause if I do something like:

match foo_out_of_many {
  Foo => true
  _ => false
}

clippy will tell me to use matches!.
You'll see it in a later PR when i match on ProviderState, I'll have to do:

match state {
   ProviderState::Uninitialized => true
   ProviderState::Pending | ProviderState::Validate | ProviderState::Proposer => false 
}

which is getting a bit noisy

Also ya derive's cool (except for the part about forcing derives on generic variables boooo)

Ok I am open to that latter one using matches! since that's verbose, but can we just do:

if state == ProviderState::Uninitialized {
    true
} else {
    false
}

I thought that when you match only 1 variant clippy recommends using an if not matches!


crates/starknet_l1_provider/src/soft_delete_index_map.rs line 29 at r3 (raw file):

Previously, giladchase wrote…

Done (had to do an extra hash query to please the borrow checker)

If we assert that the new entry has the same set of transactions as the old entry, why do we bother returning it? The std HashMap returns the old entry because it may have different contents than the new one. Should we just return a bool instead to specify if this entry already exists?

If you are really set on returning the old entry how about this?

    pub fn _insert(&mut self, tx: L1HandlerTransaction) -> Option<L1HandlerTransaction> {
        match self.txs.entry(tx.tx_hash) {
            Entry::Occupied(entry) => {
                assert_eq!(entry.get().transaction, tx);
                Some(tx)
            }
            Entry::Vacant(entry) => {
                entry.insert(TransactionEntry::new(tx));
                None
            }
        }
    }

Returning an owned value is much nice. Since we just checked equality we may as well just return the "new" one cuz it's easy. If you want to be pedantic then:

    pub fn _insert(&mut self, tx: L1HandlerTransaction) -> Option<L1HandlerTransaction> {
        match self.txs.entry(tx.tx_hash) {
            Entry::Occupied(mut entry) => {
                assert_eq!(entry.get().transaction, tx);
                Some(std::mem::replace(&mut entry.get_mut().transaction, tx))
            }
            Entry::Vacant(entry) => {
                entry.insert(TransactionEntry::new(tx));
                None
            }
        }
    }

Previous implementation inside transaction manager didn't account for
validate, the current implementation is more general, and can in the
future perhaps be reused elsewhere after some changes are made.
@giladchase giladchase force-pushed the gilad/soft-delete-index-map branch from 806a96e to 295fa55 Compare January 14, 2025 19:21
Copy link

Artifacts upload workflows:

Copy link
Contributor Author

@giladchase giladchase left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 8 of 10 files reviewed, 1 unresolved discussion (waiting on @matan-starkware)


crates/starknet_l1_provider/src/soft_delete_index_map.rs line 21 at r1 (raw file):

Previously, matan-starkware wrote…

Right, we don't want the entire project to be one big PR and block you sending. I was thinking more like this struct could be 1 big PR that you get right before breaking into many PRs for me to review. That scale make sense?

Sure, i think this will be seamless: I'll open a PR with everything split into commits, once you TAL I can close the PR then SPR it, and immediately the commits will be converted to a stack.
Or alternatively we can just use per-commit view on reviewable and review each commit in order.


crates/starknet_l1_provider/src/soft_delete_index_map.rs line 127 at r1 (raw file):

Previously, matan-starkware wrote…

Ok I am open to that latter one using matches! since that's verbose, but can we just do:

if state == ProviderState::Uninitialized {
    true
} else {
    false
}

I thought that when you match only 1 variant clippy recommends using an if not matches!

Got it. RE 1 variant, i think that's only for if let which will get recommended if the _ branch is (). But if the result is a bool then it recommends matches.


crates/starknet_l1_provider/src/soft_delete_index_map.rs line 29 at r3 (raw file):

Previously, matan-starkware wrote…

If we assert that the new entry has the same set of transactions as the old entry, why do we bother returning it? The std HashMap returns the old entry because it may have different contents than the new one. Should we just return a bool instead to specify if this entry already exists?

If you are really set on returning the old entry how about this?

    pub fn _insert(&mut self, tx: L1HandlerTransaction) -> Option<L1HandlerTransaction> {
        match self.txs.entry(tx.tx_hash) {
            Entry::Occupied(entry) => {
                assert_eq!(entry.get().transaction, tx);
                Some(tx)
            }
            Entry::Vacant(entry) => {
                entry.insert(TransactionEntry::new(tx));
                None
            }
        }
    }

Returning an owned value is much nice. Since we just checked equality we may as well just return the "new" one cuz it's easy. If you want to be pedantic then:

    pub fn _insert(&mut self, tx: L1HandlerTransaction) -> Option<L1HandlerTransaction> {
        match self.txs.entry(tx.tx_hash) {
            Entry::Occupied(mut entry) => {
                assert_eq!(entry.get().transaction, tx);
                Some(std::mem::replace(&mut entry.get_mut().transaction, tx))
            }
            Entry::Vacant(entry) => {
                entry.insert(TransactionEntry::new(tx));
                None
            }
        }
    }

Good point, bool is fine, added.

Copy link
Contributor

@matan-starkware matan-starkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r5, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @giladchase)

@giladchase giladchase added this pull request to the merge queue Jan 15, 2025
Merged via the queue into main with commit 53b06ea Jan 15, 2025
11 checks passed
@giladchase giladchase deleted the gilad/soft-delete-index-map branch January 15, 2025 14:52
@github-actions github-actions bot locked and limited conversation to collaborators Jan 17, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants