Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Core: Refactored the connection container lock to use a sync RwLock #2643

Merged
merged 4 commits into from
Nov 11, 2024

Conversation

barshaul
Copy link
Collaborator

@barshaul barshaul commented Nov 7, 2024

Core: Refactored the Connection Container Lock to Use a Synchronous RwLock

This PR refactors the connection container lock to use a synchronous RwLock instead of an asynchronous one, ensuring that the lock is held only briefly and not across await points. This change addresses potential deadlocks and improves overall lock acquisition time.

Background

In the redis-rs implementation, the entire request pipeline is blocked while the cluster client is in a recovery state. Specifically, when poll_flush calls poll_recover, the system waits for the recovery future to complete before processing other tasks. During this recovery period, some in-flight requests may already be running but remain pending. This creates a risk of deadlock if a request holds the connection container lock while the recovery future is waiting for it.

Why the Change was Made

Switching to a synchronous lock ensures that the recovery state cannot be entered while the connection container lock is held, thus preventing the recovery future from getting stuck. By preventing requests from holding the lock across await points, we reduce the risk of deadlocks.

@barshaul barshaul force-pushed the change_lock_to_sync branch 2 times, most recently from 68d8aab to b241c51 Compare November 7, 2024 13:31
@barshaul barshaul marked this pull request as ready for review November 7, 2024 13:59
@barshaul barshaul requested a review from a team as a code owner November 7, 2024 13:59
…ections

Signed-off-by: barshaul <barshaul@amazon.com>
Signed-off-by: barshaul <barshaul@amazon.com>
…wLock instead of an asynchronous one, and reducing lock acquisition time.

Signed-off-by: barshaul <barshaul@amazon.com>
Signed-off-by: barshaul <barshaul@amazon.com>
@barshaul barshaul merged commit b059785 into valkey-io:release-1.2 Nov 11, 2024
54 checks passed
avifenesh added a commit to avifenesh/valkey-glide that referenced this pull request Nov 11, 2024
…alkey-io#2643)

* Core: Refactored the connection container lock to use a synchronous RwLock instead of an asynchronous one, and reducing lock acquisition time.
* Core: Fixed redis-rs pubsub tests for server version >= 7.2.4

Signed-off-by: barshaul <barshaul@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants