Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

remove mutex from trie operations #6526

Merged
merged 11 commits into from
Dec 18, 2024

Conversation

BeniaminDrasovean
Copy link
Contributor

@BeniaminDrasovean BeniaminDrasovean commented Oct 8, 2024

Reasoning behind the pull request

  • The trie accesses are serialized using a mutex on all of the public functions. While this assures that parallel calls will not result in data races, it also introduces a bottleneck.

Proposed changes

  • Remove the mutex from the public trie functions, and synchronize the accesses at the trie node level.
  • Remove GetAllHashes func from the trie. This was not used, and it could lead to oom if used for a large trie.
  • Refactor get, update and delete from the trie so that they can be called in parallel

Testing procedure

  • Unit tests
  • System tests when all the PRs are merged in the feat branch

Pre-requisites

Based on the Contributing Guidelines the PR author and the reviewers must check the following requirements are met:

  • was the PR targeted to the correct branch?
  • if this is a larger feature that probably needs more than one PR, is there a feat branch created?
  • if this is a feat branch merging, do all satellite projects have a proper tag inside go.mod?

…refactor-trie-mutex-usage

# Conflicts:
#	trie/branchNode.go
#	trie/branchNode_test.go
#	trie/extensionNode.go
#	trie/extensionNode_test.go
#	trie/interface.go
#	trie/leafNode.go
#	trie/leafNode_test.go
#	trie/patriciaMerkleTrie.go
…refactor-trie-mutex-usage

# Conflicts:
#	trie/branchNode_test.go
#	trie/extensionNode_test.go
#	trie/leafNode_test.go
#	trie/patriciaMerkleTrie.go
Base automatically changed from update-trie-on-goroutines to feat/trie-mutex-refactor December 9, 2024 09:30
Copy link

codecov bot commented Dec 9, 2024

Codecov Report

Attention: Patch coverage is 90.65421% with 30 lines in your changes missing coverage. Please review.

Project coverage is 79.94%. Comparing base (a5bc666) to head (63d7061).
Report is 24 commits behind head on feat/trie-mutex-refactor.

Files with missing lines Patch % Lines
trie/modifiedChildren.go 0.00% 13 Missing ⚠️
trie/branchNode.go 88.00% 6 Missing and 3 partials ⚠️
trie/extensionNode.go 92.00% 2 Missing and 2 partials ⚠️
trie/patriciaMerkleTrie.go 96.19% 3 Missing and 1 partial ⚠️
Additional details and impacted files
@@                    Coverage Diff                     @@
##           feat/trie-mutex-refactor    #6526    +/-   ##
==========================================================
  Coverage                     79.93%   79.94%            
==========================================================
  Files                           733      737     +4     
  Lines                         96458    96604   +146     
==========================================================
+ Hits                          77106    77227   +121     
- Misses                        13944    13981    +37     
+ Partials                       5408     5396    -12     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@BeniaminDrasovean BeniaminDrasovean marked this pull request as ready for review December 9, 2024 11:32
@sasurobert sasurobert self-requested a review December 11, 2024 07:24
@@ -110,3 +107,12 @@ type IdleNodeProvider interface {
IsIdle() bool
IsInterfaceNil() bool
}

type RootManager interface {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment missing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

defer en.childMutex.Unlock()

isChildCollapsed := en.child == nil && len(en.EncodedChild) != 0
if !isChildCollapsed {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not use en.isCollaped? exactly the same functions is wrote on top of this function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed.

@@ -249,12 +255,14 @@ func (tr *patriciaMerkleTrie) deleteBatch(data []core.TrieData) error {
return nil
}

if tr.root == nil {
rootNode := tr.GetRootNode()
if rootNode == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check.IfNil ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


if tr.root == nil {
rootNode := tr.GetRootNode()
if rootNode == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check.ifnil?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

mutOperation sync.RWMutex
}

func NewRootManager() *rootManager {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a lot of comments are missing here. on all the file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

@ssd04 ssd04 self-requested a review December 12, 2024 16:38
Comment on lines 532 to 535
return nil
}

err := adb.removeCode(baseAcc)
if err != nil {
return err
}

err = adb.removeDataTrie(baseAcc)
if err != nil {
return err
}

return nil
}

func (adb *AccountsDB) removeDataTrie(baseAcc baseAccountHandler) error {
rootHash := baseAcc.GetRootHash()
if len(rootHash) == 0 {
return nil
}

dataTrie, err := adb.mainTrie.Recreate(rootHash)
if err != nil {
return err
}

hashes, err := dataTrie.GetAllHashes()
if err != nil {
return err
}

adb.obsoleteDataTrieHashes[string(rootHash)] = hashes

entry, err := NewJournalEntryDataTrieRemove(rootHash, adb.obsoleteDataTrieHashes)
if err != nil {
return err
}
adb.journalize(entry)

return nil
return adb.removeCode(baseAcc)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename removeCodeAndDataTrie func? or remove it and use removeCode directly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to keep track of this. When RemoveAccount() will be used, we need to implement an efficient way of marking the removed hashes for pruning. Added a TODO

@@ -201,22 +205,25 @@ func (tr *patriciaMerkleTrie) insertBatch(sortedDataForInsertion []core.TrieData
return nil
}

if tr.root == nil {
rootNode := tr.GetRootNode()
if rootNode == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check.IfNil ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines +11 to +12
hash []byte
dirty bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't we protect these fields by mutex in setters and getters?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indded. Added TODO and will add mutex protection in a future PR (to minimize conflicts with already opened PRs)

Comment on lines +636 to +637
bn.mutex.Lock()
defer bn.mutex.Unlock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if setters in baseNode are protected by mutex, i think this protection here is not needed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is needed in order to keep all changes atomic. If I remove this there might be edgecases where different baseNode fields change from different goroutines.

Comment on lines +19 to +24
setDirty(bool)
isDirty() bool
getMarshalizer() marshal.Marshalizer
setMarshalizer(marshal.Marshalizer)
getHasher() hashing.Hasher
setHasher(hashing.Hasher)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we can have a baseNode interface

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines 708 to +709
bn.hash = nil
bn.dirty = true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if mutex protection is added in baseNode, cand we use setters here and avoid mutex in this file entirely?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really, all these operations need to happen atomically.

Comment on lines +177 to +178
ln.mutex.RLock()
defer ln.mutex.RUnlock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need mutex here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this mutex is protecting hash and dirty fields, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It protects everything that can change on the node, so ln.Key & ln.Value also

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add unit tests for this file (or TODO for future PRs)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the file as it is no longer used.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add unit tests for this file (or TODO for future PRs)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added TODO

Comment on lines +25 to +26
func (bn *baseNode) isDirty() bool {
return bn.dirty
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think rootManager is accessing this so it needs protection here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, will refactor mutex usage in a future PR

@BeniaminDrasovean BeniaminDrasovean merged commit be9a565 into feat/trie-mutex-refactor Dec 18, 2024
5 checks passed
@BeniaminDrasovean BeniaminDrasovean deleted the refactor-trie-mutex-usage branch December 18, 2024 09:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants