-
Notifications
You must be signed in to change notification settings - Fork 31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(starknet_l1_provider): add SoftDeleteIndexMap
#3208
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 6 of 6 files at r1, all commit messages.
Reviewable status: all files reviewed, 11 unresolved discussions (waiting on @giladchase)
Suggestion:
future perhaps be reused elsewhere after some changes are made.
crates/starknet_l1_provider/src/test_utils.rs
line 86 at r1 (raw file):
impl TransactionManagerContent { fn complete_to_tx_manager(mut self) -> TransactionManager {
TryInto?
Code quote:
fn complete_to_tx_manager(mut self) -> TransactionManager {
crates/starknet_l1_provider/src/soft_delete_index_map.rs
line 15 at r1 (raw file):
#[derive(Clone, Debug, Default)] pub struct SoftDeleteIndexMap { txs: IndexMap<TransactionHash, TransactionEntry>,
In principle what you are trying to achieve is a priority queue? It just so happens that our current priority is first come first serve which IndexMap gives us?
Is there any plan to move into a different form of selection?
Code quote:
txs: IndexMap<TransactionHash, TransactionEntry>,
crates/starknet_l1_provider/src/soft_delete_index_map.rs
line 21 at r1 (raw file):
impl SoftDeleteIndexMap { /// Inserts a transaction into the map, returning the previous transaction if it existed. pub fn _insert(&mut self, tx: L1HandlerTransaction) -> Option<L1HandlerTransaction> {
Suggestion:
pub fn insert(&mut self, tx: L1HandlerTransaction) -> Option<L1HandlerTransaction> {
crates/starknet_l1_provider/src/soft_delete_index_map.rs
line 23 at r1 (raw file):
pub fn _insert(&mut self, tx: L1HandlerTransaction) -> Option<L1HandlerTransaction> { match self.txs.entry(tx.tx_hash) { Entry::Occupied(entry) => Some(entry.get().transaction.clone()),
Why don't we just call self.txs.insert
?
IIUC this like does not actually insert the new tx into the map. Rather we assume that the tx_hash
indicates the contents are the same and we we clone the contents of the old entry.
We intentionally ignore the TxState of the existing entry?
Code quote:
Entry::Occupied(entry) => Some(entry.get().transaction.clone()),
crates/starknet_l1_provider/src/soft_delete_index_map.rs
line 32 at r1 (raw file):
/// Soft delete and return a reference to the first unstaged transaction, by insertion order. pub fn soft_pop_back(&mut self) -> Option<&L1HandlerTransaction> {
Impl looks to me like we remove the first available element, not the last
Suggestion:
pub fn soft_pop_front(&mut self) -> Option<&L1HandlerTransaction> {
crates/starknet_l1_provider/src/soft_delete_index_map.rs
line 65 at r1 (raw file):
/// BTreeMap<u32, TransactionEntry>, Hashmap<TransactionHash, u32> and a counter: u32, such that /// every new tx is inserted to the map with key counter++ and the counter is not reduced /// when removing entries. Once the counter reaches u32::MAX/2 we recreate the DS in Theta(n).
This should not be in the public doc, switch to //
So my understanding:
- N = tx_hashes.len()
- M = self.txs.len()
- Build tx_hash set - O(N)
- Partition txs - O(M) since each contains check is O(1)
- rebuild txs - O(M-N)
- build committed - O(N)
Let's simplify this and say "Operation is linear time with both the number of known transactions and the number of committed transactions"
Code quote:
/// Performance note: this method runs in Theta(n), since removing elements from indexmap
/// requires shifting elements in O(n). Since we are removing multiple elements, recreating
/// the indexmap is faster than removing each element individually.
/// This is assumed to be good enough while l1-handler numbers remain low, but if this changes
/// and we need log(n) removals (amortized), replace indexmap with this (basically a
/// BTreeIndexMap):
/// BTreeMap<u32, TransactionEntry>, Hashmap<TransactionHash, u32> and a counter: u32, such that
/// every new tx is inserted to the map with key counter++ and the counter is not reduced
/// when removing entries. Once the counter reaches u32::MAX/2 we recreate the DS in Theta(n).
crates/starknet_l1_provider/src/soft_delete_index_map.rs
line 82 at r1 (raw file):
/// Rolls back all staged transactions, converting them to unstaged. pub fn _rollback(&mut self) {
Why are pub functions underscored? That's a python thing I thought.
I think of "rollback" being for a commit, not staging. I prefer either unstage or reset_staging, or rollback_staging.
Suggestion:
pub fn unstage(&mut self) {
crates/starknet_l1_provider/src/soft_delete_index_map.rs
line 98 at r1 (raw file):
SoftDeleteIndexMap { txs, ..Default::default() } } }
We don't also want a new()
fn?
Code quote:
impl From<Vec<L1HandlerTransaction>> for SoftDeleteIndexMap {
fn from(txs: Vec<L1HandlerTransaction>) -> Self {
let txs = txs.into_iter().map(|tx| (tx.tx_hash, TransactionEntry::new(tx))).collect();
SoftDeleteIndexMap { txs, ..Default::default() }
}
}
crates/starknet_l1_provider/src/soft_delete_index_map.rs
line 124 at r1 (raw file):
self.state = TxState::Staged; } }
What about removing this and unstage and having a function set_state(TxState)
?
Suggestion:
pub fn stage(&mut self) {
self.state = TxState::Staged;
}
crates/starknet_l1_provider/src/soft_delete_index_map.rs
line 127 at r1 (raw file):
pub fn is_available(&self) -> bool { matches!(self.state, TxState::Unstaged)
Not blocking, I just like to avoid macros when easy to.
Suggestion:
match self.state {
TxState::Unstaged => true,
TxState::Staged => false,
}
d59f76d
to
4f516ec
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 3 of 6 files reviewed, 9 unresolved discussions (waiting on @matan-starkware)
Previously, matan-starkware wrote…
?
Good lord 🙈 tnx
crates/starknet_l1_provider/src/soft_delete_index_map.rs
line 15 at r1 (raw file):
In principle what you are trying to achieve is a priority queue? It just so happens that our current priority is first come first serve which IndexMap gives us?
Ya for now first come first serve, don't know of any current plans of this changing any time soon.
crates/starknet_l1_provider/src/soft_delete_index_map.rs
line 23 at r1 (raw file):
IIUC this like does not actually insert the new tx into the map. Rather we assume that the
tx_hash
indicates the contents are the same and we we clone the contents of the old entry.
Yes I was assuming that tx_hash uniquely describes the L1TransactionHandler inside, since it is computed by having it's fields. In other words, unless someone adds a bug somewhere, the given tx
should be identical to the stored one.
Added an assert to verify this.
We intentionally ignore the TxState of the existing entry?
Ya it shouldn't matter for insertion. In fact, this is one of those cases that shouldn't happen anyway, since our scraper never backtracks and there are no dups on L1, but better be safe and check it anyway.
crates/starknet_l1_provider/src/soft_delete_index_map.rs
line 32 at r1 (raw file):
Previously, matan-starkware wrote…
Impl looks to me like we remove the first available element, not the last
Arrrrg i also get those mixed up, thanks!
crates/starknet_l1_provider/src/soft_delete_index_map.rs
line 65 at r1 (raw file):
This should not be in the public doc, switch to
//
Right, fixed.
Let's simplify this and say "Operation is linear time with both the number of known transactions and the number of committed transactions"
Done (if I understand correctly which part you wanted changed?)
crates/starknet_l1_provider/src/soft_delete_index_map.rs
line 82 at r1 (raw file):
Why are pub functions underscored? That's a python thing I thought.
Not related to the python privacy thing, even if it's pub it's still subject to the unused
lint because the module is pub(crate)
and not pub
.
The _
is to quiet it down without haveing to add a ignore lint thing
I think of "rollback" being for a commit, not staging. I prefer either unstage or reset_staging, or rollback_staging.
Changed to rollback_staging
. WDYM by rollback being for a commit? I picked rollback since this staging behavior seems almost exactly like db transactions, where rollback
behaves like reset_staging, as the opposite of commit, in which all operations in the session are undo'ed (only if commit has been called previously)
crates/starknet_l1_provider/src/soft_delete_index_map.rs
line 98 at r1 (raw file):
Previously, matan-starkware wrote…
We don't also want a
new()
fn?
Sure, added (also with _
because not used yet).
crates/starknet_l1_provider/src/soft_delete_index_map.rs
line 124 at r1 (raw file):
Previously, matan-starkware wrote…
What about removing this and unstage and having a function
set_state(TxState)
?
Done
crates/starknet_l1_provider/src/soft_delete_index_map.rs
line 127 at r1 (raw file):
Previously, matan-starkware wrote…
Not blocking, I just like to avoid macros when easy to.
death to macros
crates/starknet_l1_provider/src/test_utils.rs
line 86 at r1 (raw file):
Previously, matan-starkware wrote…
TryInto?
Done (made it an Into since it can't fail)
crates/starknet_l1_provider/src/soft_delete_index_map.rs
line 21 at r1 (raw file):
impl SoftDeleteIndexMap { /// Inserts a transaction into the map, returning the previous transaction if it existed. pub fn _insert(&mut self, tx: L1HandlerTransaction) -> Option<L1HandlerTransaction> {
Not used in this PR yet, will be used in subsequent commits I have in store, wanted to shave things off of this PR as much as possible.
I could have added a ignore(unused)
in the file, but those tend to get forgotten, and the _
prefix is impossible to miss.
I can remove and add the ignore if you prefer though.
be0cec0
to
a4102d8
Compare
4f516ec
to
0c7d6df
Compare
0c7d6df
to
4d2d043
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 9 of 9 files at r3, all commit messages.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @giladchase)
crates/starknet_l1_provider/src/soft_delete_index_map.rs
line 21 at r1 (raw file):
Previously, giladchase wrote…
Not used in this PR yet, will be used in subsequent commits I have in store, wanted to shave things off of this PR as much as possible.
I could have added a
ignore(unused)
in the file, but those tend to get forgotten, and the_
prefix is impossible to miss.
I can remove and add the ignore if you prefer though.
Ok I missed that fact, yeah underscore is nicer.
Because this is already under review let's just move forward as is, but WDYT of in the future:
- You write the whole thing; full struct + tests + usage (I'm assuming you already did that)
- Sit with me to explain it - so I get the full context
- Break into PRs to avoid these unused function
crates/starknet_l1_provider/src/soft_delete_index_map.rs
line 23 at r1 (raw file):
Previously, giladchase wrote…
IIUC this like does not actually insert the new tx into the map. Rather we assume that the
tx_hash
indicates the contents are the same and we we clone the contents of the old entry.Yes I was assuming that tx_hash uniquely describes the L1TransactionHandler inside, since it is computed by having it's fields. In other words, unless someone adds a bug somewhere, the given
tx
should be identical to the stored one.Added an assert to verify this.
We intentionally ignore the TxState of the existing entry?
Ya it shouldn't matter for insertion. In fact, this is one of those cases that shouldn't happen anyway, since our scraper never backtracks and there are no dups on L1, but better be safe and check it anyway.
Ok
crates/starknet_l1_provider/src/soft_delete_index_map.rs
line 82 at r1 (raw file):
Previously, giladchase wrote…
Why are pub functions underscored? That's a python thing I thought.
Not related to the python privacy thing, even if it's pub it's still subject to the
unused
lint because the module ispub(crate)
and notpub
.
The_
is to quiet it down without haveing to add a ignore lint thingI think of "rollback" being for a commit, not staging. I prefer either unstage or reset_staging, or rollback_staging.
Changed to
rollback_staging
. WDYM by rollback being for a commit? I picked rollback since this staging behavior seems almost exactly like db transactions, whererollback
behaves like reset_staging, as the opposite of commit, in which all operations in the session are undo'ed (only if commit has been called previously)
In my mind "rollback a commit" is a phrase so I like mentioning staging explicitly.
crates/starknet_l1_provider/src/soft_delete_index_map.rs
line 127 at r1 (raw file):
Previously, giladchase wrote…
death to macros
Hehe
I will admit that I love the derive
macro
crates/starknet_l1_provider/src/soft_delete_index_map.rs
line 28 at r3 (raw file):
match self.txs.entry(tx.tx_hash) { Entry::Occupied(entry) => { debug_assert_eq!(entry.get().transaction, tx);
I'd hold off on making this debug until:
- We have spent some time with the code and are confident there is no such bug
- We are worried about optimizing this path
Suggestion:
assert_eq!(entry.get().transaction, tx);
crates/starknet_l1_provider/src/soft_delete_index_map.rs
line 29 at r3 (raw file):
Entry::Occupied(entry) => { debug_assert_eq!(entry.get().transaction, tx); Some(entry.get().transaction.clone())
avoid clone?
Suggestion:
Some(tx)
4d2d043
to
806a96e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 4 of 9 files reviewed, 2 unresolved discussions (waiting on @matan-starkware)
crates/starknet_l1_provider/src/soft_delete_index_map.rs
line 21 at r1 (raw file):
Previously, matan-starkware wrote…
Ok I missed that fact, yeah underscore is nicer.
Because this is already under review let's just move forward as is, but WDYT of in the future:
- You write the whole thing; full struct + tests + usage (I'm assuming you already did that)
- Sit with me to explain it - so I get the full context
- Break into PRs to avoid these unused function
Sure.
Though in order to completely it'll mean PRs will get somewhat large, because whether I go bottom-up or top-down, you'll always hit unused issues unless the whole flow is added at once.
Another option which might make things simpler is to make everything pub
while it's WIP, and after everything's merged we can do a really quick one-pass and add pub(crate)/remove-pub on everything, once all the usages are in.
crates/starknet_l1_provider/src/soft_delete_index_map.rs
line 23 at r1 (raw file):
Previously, matan-starkware wrote…
Ok
Done.
crates/starknet_l1_provider/src/soft_delete_index_map.rs
line 127 at r1 (raw file):
Previously, matan-starkware wrote…
Hehe
I will admit that I love the
derive
macro
Note that for matches!
we have to fight clippy a bit, cause if I do something like:
match foo_out_of_many {
Foo => true
_ => false
}
clippy will tell me to use matches!.
You'll see it in a later PR when i match on ProviderState, I'll have to do:
match state {
ProviderState::Uninitialized => true
ProviderState::Pending | ProviderState::Validate | ProviderState::Proposer => false
}
which is getting a bit noisy
Also ya derive's cool (except for the part about forcing derives on generic variables boooo)
crates/starknet_l1_provider/src/soft_delete_index_map.rs
line 29 at r3 (raw file):
Previously, matan-starkware wrote…
avoid clone?
Done (had to do an extra hash query to please the borrow checker)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 5 of 5 files at r4, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @giladchase)
crates/starknet_l1_provider/src/soft_delete_index_map.rs
line 21 at r1 (raw file):
Previously, giladchase wrote…
Sure.
Though in order to completely it'll mean PRs will get somewhat large, because whether I go bottom-up or top-down, you'll always hit unused issues unless the whole flow is added at once.
Another option which might make things simpler is to make everything
pub
while it's WIP, and after everything's merged we can do a really quick one-pass and add pub(crate)/remove-pub on everything, once all the usages are in.
Right, we don't want the entire project to be one big PR and block you sending. I was thinking more like this struct could be 1 big PR that you get right before breaking into many PRs for me to review. That scale make sense?
crates/starknet_l1_provider/src/soft_delete_index_map.rs
line 127 at r1 (raw file):
Previously, giladchase wrote…
Note that for
matches!
we have to fight clippy a bit, cause if I do something like:match foo_out_of_many { Foo => true _ => false }clippy will tell me to use matches!.
You'll see it in a later PR when i match on ProviderState, I'll have to do:match state { ProviderState::Uninitialized => true ProviderState::Pending | ProviderState::Validate | ProviderState::Proposer => false }which is getting a bit noisy
Also ya derive's cool (except for the part about forcing derives on generic variables boooo)
Ok I am open to that latter one using matches!
since that's verbose, but can we just do:
if state == ProviderState::Uninitialized {
true
} else {
false
}
I thought that when you match only 1 variant clippy recommends using an if
not matches!
crates/starknet_l1_provider/src/soft_delete_index_map.rs
line 29 at r3 (raw file):
Previously, giladchase wrote…
Done (had to do an extra hash query to please the borrow checker)
If we assert that the new entry has the same set of transactions as the old entry, why do we bother returning it? The std HashMap returns the old entry because it may have different contents than the new one. Should we just return a bool instead to specify if this entry already exists?
If you are really set on returning the old entry how about this?
pub fn _insert(&mut self, tx: L1HandlerTransaction) -> Option<L1HandlerTransaction> {
match self.txs.entry(tx.tx_hash) {
Entry::Occupied(entry) => {
assert_eq!(entry.get().transaction, tx);
Some(tx)
}
Entry::Vacant(entry) => {
entry.insert(TransactionEntry::new(tx));
None
}
}
}
Returning an owned value is much nice. Since we just checked equality we may as well just return the "new" one cuz it's easy. If you want to be pedantic then:
pub fn _insert(&mut self, tx: L1HandlerTransaction) -> Option<L1HandlerTransaction> {
match self.txs.entry(tx.tx_hash) {
Entry::Occupied(mut entry) => {
assert_eq!(entry.get().transaction, tx);
Some(std::mem::replace(&mut entry.get_mut().transaction, tx))
}
Entry::Vacant(entry) => {
entry.insert(TransactionEntry::new(tx));
None
}
}
}
Previous implementation inside transaction manager didn't account for validate, the current implementation is more general, and can in the future perhaps be reused elsewhere after some changes are made.
806a96e
to
295fa55
Compare
Artifacts upload workflows: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 8 of 10 files reviewed, 1 unresolved discussion (waiting on @matan-starkware)
crates/starknet_l1_provider/src/soft_delete_index_map.rs
line 21 at r1 (raw file):
Previously, matan-starkware wrote…
Right, we don't want the entire project to be one big PR and block you sending. I was thinking more like this struct could be 1 big PR that you get right before breaking into many PRs for me to review. That scale make sense?
Sure, i think this will be seamless: I'll open a PR with everything split into commits, once you TAL I can close the PR then SPR it, and immediately the commits will be converted to a stack.
Or alternatively we can just use per-commit view on reviewable and review each commit in order.
crates/starknet_l1_provider/src/soft_delete_index_map.rs
line 127 at r1 (raw file):
Previously, matan-starkware wrote…
Ok I am open to that latter one using
matches!
since that's verbose, but can we just do:if state == ProviderState::Uninitialized { true } else { false }
I thought that when you match only 1 variant clippy recommends using an
if
notmatches!
Got it. RE 1 variant, i think that's only for if let
which will get recommended if the _
branch is ()
. But if the result is a bool then it recommends matches.
crates/starknet_l1_provider/src/soft_delete_index_map.rs
line 29 at r3 (raw file):
Previously, matan-starkware wrote…
If we assert that the new entry has the same set of transactions as the old entry, why do we bother returning it? The std HashMap returns the old entry because it may have different contents than the new one. Should we just return a bool instead to specify if this entry already exists?
If you are really set on returning the old entry how about this?
pub fn _insert(&mut self, tx: L1HandlerTransaction) -> Option<L1HandlerTransaction> { match self.txs.entry(tx.tx_hash) { Entry::Occupied(entry) => { assert_eq!(entry.get().transaction, tx); Some(tx) } Entry::Vacant(entry) => { entry.insert(TransactionEntry::new(tx)); None } } }
Returning an owned value is much nice. Since we just checked equality we may as well just return the "new" one cuz it's easy. If you want to be pedantic then:
pub fn _insert(&mut self, tx: L1HandlerTransaction) -> Option<L1HandlerTransaction> { match self.txs.entry(tx.tx_hash) { Entry::Occupied(mut entry) => { assert_eq!(entry.get().transaction, tx); Some(std::mem::replace(&mut entry.get_mut().transaction, tx)) } Entry::Vacant(entry) => { entry.insert(TransactionEntry::new(tx)); None } } }
Good point, bool is fine, added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r5, all commit messages.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @giladchase)
Previous implementation inside transaction manager didn't account for
validate, the current implementation is more general, and can in the
past perhaps be reused elsewhere after some changes are made.