
Implement MCS Combining lock #666

Merged
merged 4 commits into from
Jun 28, 2024
Conversation

mjp41
Member

@mjp41 mjp41 commented Jun 19, 2024

This is a hybrid of Flat Combining and the MCS queue lock. It uses
a queue like the MCS queue lock, but each item additionally
contains a thunk to perform the body of the lock. This enables
other threads to perform the work on behalf of the thread that
initially issued the request.
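The design described above can be sketched roughly as follows. This is an illustrative simplification, not snmalloc's actual implementation: the names (`CombiningNode`, `CombiningLock`, `run`) are invented for the example, waiters simply spin, and `std::function` stands in for whatever thunk representation the real code uses.

```cpp
#include <atomic>
#include <functional>
#include <thread>

// Hypothetical sketch of a combining lock. Each waiter enqueues an
// on-stack node carrying a thunk; whichever thread ends up holding the
// lock (the combiner) runs pending thunks on the waiters' behalf.
struct CombiningNode
{
  std::atomic<CombiningNode*> next{nullptr};
  std::atomic<bool> done{false};
  std::function<void()> thunk;
};

struct CombiningLock
{
  std::atomic<CombiningNode*> tail{nullptr};

  void run(std::function<void()> f)
  {
    CombiningNode node;
    node.thunk = std::move(f);
    // Enqueue as in an MCS lock: swap ourselves in as the new tail.
    CombiningNode* prev = tail.exchange(&node, std::memory_order_acq_rel);
    if (prev != nullptr)
    {
      prev->next.store(&node, std::memory_order_release);
      // Some combiner will execute our thunk; wait for it to finish.
      while (!node.done.load(std::memory_order_acquire))
        std::this_thread::yield();
      return;
    }
    // Queue was empty: we are the combiner. Run our own thunk, then any
    // thunks that queued behind us while we were working.
    CombiningNode* curr = &node;
    while (true)
    {
      curr->thunk();
      CombiningNode* expected = curr;
      // If no one is queued behind curr, try to close the queue.
      if (tail.compare_exchange_strong(
            expected,
            nullptr,
            std::memory_order_acq_rel,
            std::memory_order_relaxed))
      {
        curr->done.store(true, std::memory_order_release);
        return;
      }
      // An enqueue is in progress; wait for the next pointer to appear.
      // Read it before setting done, as the waiter may destroy its node
      // (it lives on the waiter's stack) as soon as done is observed.
      CombiningNode* next;
      while ((next = curr->next.load(std::memory_order_acquire)) == nullptr)
        std::this_thread::yield();
      curr->done.store(true, std::memory_order_release);
      curr = next;
    }
  }
};
```

The key property is that a thread calling `run` may return without ever holding the lock itself: another thread executed its thunk for it, which keeps the critical sections on one core while the queue is busy.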

@mjp41 mjp41 force-pushed the lock_changes branch 3 times, most recently from d1f5f53 to ffa8fcb on June 20, 2024 21:11
@mjp41
Member Author

mjp41 commented Jun 21, 2024

I attempted to integrate atomic::wait and notify_one, but it went much slower, so I am avoiding that for now. I don't think we hold the lock long enough for the cost to be worth it.

@mjp41 mjp41 marked this pull request as ready for review June 21, 2024 12:50
@mjp41 mjp41 requested a review from nwf-msr June 21, 2024 12:50
This is a hybrid of Flat Combining and the MCS queue lock. It uses
a queue like the MCS queue lock, but each item additionally
contains a thunk to perform the body of the lock. This enables
other threads to perform the work on behalf of the thread that
initially issued the request.
This update adds a fast-path flag for the uncontended case, which
reduces the number of atomic operations when the lock is uncontended.
// queue.
auto curr_c = curr;
if (lock.head.compare_exchange_strong(
      curr_c, nullptr, std::memory_order_acq_rel))
Contributor
The failure order should be Relaxed

Member Author
Good catch, thanks
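The review point above can be illustrated in isolation. The failure memory order of a compare-exchange applies only to the load performed when the exchange fails; it cannot be release or acq_rel, and when the failure path just retries or discards the observed value, relaxed is sufficient. This is a hypothetical helper, not the PR's actual code:

```cpp
#include <atomic>

// Illustrative sketch: try to reset a head pointer to nullptr if it
// still equals the value we expect. On success the CAS is a
// read-modify-write and needs acq_rel; on failure only a load occurs,
// and since this caller ignores the loaded value for synchronization
// purposes, relaxed is enough.
inline bool try_close_queue(std::atomic<void*>& head, void* expected_head)
{
  void* expected = expected_head;
  return head.compare_exchange_strong(
    expected,
    nullptr,
    std::memory_order_acq_rel, // success: acquire + release
    std::memory_order_relaxed); // failure: a plain load suffices
}
```

Using acquire (or the default derived from a single acq_rel argument) on failure costs ordering that nothing in the failure path relies on.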

void attach(CombiningLock& lock)
{
// Test if no one is waiting
if (lock.head.load(std::memory_order_relaxed) == nullptr)
Contributor
Maybe it is beneficial to separate this region out as a fast path that can be inlined. The remaining parts could be put into an attach_slow function.

Member Author
I don't think this code is used frequently enough here to warrant that, but maybe I should for future use.
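The reviewer's suggestion is a common pattern: keep the uncontended check small enough to inline at every call site, and move the queueing logic out of line. A hypothetical sketch (the `Lock`, `attach`, and `attach_slow` names follow the suggestion; the slow path here is a plain spin rather than snmalloc's combining queue):

```cpp
#include <atomic>

// Illustrative fast-path/slow-path split, not snmalloc's actual code.
struct Lock
{
  std::atomic<bool> held{false};
};

// Slow path: kept out of line so the inlined fast path stays small.
// (On GCC/Clang one might also mark this [[gnu::noinline]].)
inline void attach_slow(Lock& lock)
{
  // Placeholder for the queueing/combining logic; here we just spin.
  while (lock.held.exchange(true, std::memory_order_acquire))
  {
  }
}

inline void attach(Lock& lock)
{
  // Fast path: uncontended acquire with a single atomic RMW.
  if (!lock.held.exchange(true, std::memory_order_acquire))
    return;
  attach_slow(lock);
}

inline void release(Lock& lock)
{
  lock.held.store(false, std::memory_order_release);
}
```

The payoff is in code size at call sites: the inlined portion is one branch and one atomic operation, and the rarely taken slow path does not pollute the instruction cache.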

Contributor

@nwf-msr nwf-msr left a comment

One thought for the future, and one bit of humor, but this LGTM. Took me a moment to understand again how the MCS lock's queue worked; one of these days it'll stick.

@@ -133,4 +133,11 @@ namespace snmalloc
lock.flag.store(false, std::memory_order_release);
}
};

template <typename F>
inline void with(FlagWord& lock, F&& f)
Contributor
Not when? ;)

* might sort a collection of inserts, and perform them in a single traversal.
*
* Note that, we should perhaps add a Futex/WakeOnAddress mode to improve
* performance in the contended case, rather than spinning.
Contributor
Could perhaps templatize the class on <bool notify = false> or such?

@mjp41 mjp41 merged commit 6af38ac into microsoft:main Jun 28, 2024
47 checks passed
@mjp41 mjp41 deleted the lock_changes branch June 28, 2024 14:43
@SchrodingerZhu
Contributor

Another interesting synchronization technique using on-stack nodes is Hemlock from Oracle: https://arxiv.org/pdf/2102.03863

@mjp41
Member Author

mjp41 commented Jun 29, 2024

Thanks for the link. I hadn't seen that algorithm.
