monitor_notify_by_pid and monitor_unsubscribe may cause dead lock #270

Merged: 1 commit, Jun 5, 2024

@zmrush commented May 24, 2024

monitor_notify_by_pid may block while sending a message to a channel because the bounded channel is full, and it still holds the monitor lock while it waits. Meanwhile, monitor_unsubscribe may block because the monitor lock is held by monitor_notify_by_pid.
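
To make the interaction concrete, here is a minimal sketch of the two conditions described above. It is not the shim's code: the `Monitor` struct, the `demonstrate` function, and the use of the standard library's `sync_channel` as the bounded channel are all assumptions made for illustration. It uses `try_send` and `try_lock` so that it reports the would-block conditions instead of actually hanging.

```rust
use std::sync::mpsc::{sync_channel, SyncSender, TrySendError};
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for the monitor: a list of bounded subscriber queues.
struct Monitor {
    subscribers: Vec<SyncSender<u32>>,
}

/// Returns (send_would_block, unsubscribe_would_block).
fn demonstrate() -> (bool, bool) {
    // Bounded channel with capacity 1; the receiver is kept alive but never drained.
    let (tx, _rx) = sync_channel::<u32>(1);
    let monitor = Arc::new(Mutex::new(Monitor { subscribers: vec![tx] }));

    // "notify" path: take the lock, then send while still holding it.
    let guard = monitor.lock().unwrap();
    guard.subscribers[0].try_send(1).unwrap(); // fills the queue
    // A blocking send() here would now wait forever with the lock held.
    let send_would_block = matches!(
        guard.subscribers[0].try_send(2),
        Err(TrySendError::Full(_))
    );

    // "unsubscribe" path on another thread: it needs the same lock.
    let other = Arc::clone(&monitor);
    let handle = thread::spawn(move || {
        let blocked = other.try_lock().is_err();
        blocked
    });
    let unsubscribe_would_block = handle.join().unwrap();

    drop(guard);
    (send_would_block, unsubscribe_would_block)
}
```

Under these assumptions both values come back `true`: the sender cannot make progress because the queue is full, and the other thread cannot take the lock because the sender still holds it.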

@github-actions github-actions bot added the C-shim Containerd shim label May 24, 2024
@zmrush (Author) commented May 24, 2024

@kzys @dims @caniszczyk @tianon Hi, could anyone take a look at this commit? Thanks.

@zmrush (Author) commented May 27, 2024

@mxpv Could you check this commit?

@Burning1020 (Member) commented May 28, 2024

Can you find the reason why the monitor lock isn't being released? I think changing the channel to unbounded reduces the likelihood of this issue, but it doesn't really resolve it.

@zmrush (Author) commented May 28, 2024

The monitor_notify_by_pid method first takes the MONITOR lock and then sends a message to a bounded channel. The send may block, and in that case the MONITOR lock is not released. An alternative would be to take the lock, copy the channels into a vector, release the lock, and then send the messages using the newly built vector. With that approach the lock is released, so there is no deadlock, but the send can still block because the channel is bounded. So I think replacing the bounded channel is needed in any case, which is what is implemented in the sync feature. @Burning1020
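
The alternative described above (copy the channels under the lock, release it, then send), combined with an unbounded channel, could be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: `Monitor`, `subscribe`, and `notify` are hypothetical names, and the standard library's unbounded `channel` stands in for whatever channel type the shim actually uses.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::Mutex;

/// Hypothetical stand-in for the monitor: a list of unbounded subscriber queues.
struct Monitor {
    subscribers: Vec<Sender<u32>>,
}

/// Registers a new subscriber and hands back its receiving end.
fn subscribe(monitor: &Mutex<Monitor>) -> Receiver<u32> {
    let (tx, rx) = channel();
    monitor.lock().unwrap().subscribers.push(tx);
    rx
}

/// Notifies every subscriber about `pid`; returns how many sends succeeded.
fn notify(monitor: &Mutex<Monitor>, pid: u32) -> usize {
    // The guard is a temporary, so the lock is released at the end of this
    // statement, before any message is sent.
    let senders: Vec<Sender<u32>> = monitor.lock().unwrap().subscribers.clone();

    // Sending on an unbounded channel does not wait for the receiver;
    // it only fails if the receiver has been dropped.
    senders.iter().filter(|tx| tx.send(pid).is_ok()).count()
}
```

In this sketch a slow or idle subscriber delays neither `notify` nor any other caller that needs the lock. The cost is that undelivered messages accumulate in the queue.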

@Burning1020 (Member)

That makes sense. If the receiver never consumes the messages, an unbounded channel may lead to OOM, but that's unlikely.
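
As a small illustration of that trade-off, the sketch below sends messages on an unbounded channel with nothing draining it and then counts what has piled up. It uses the standard library's `channel` as a stand-in, and `backlog_after` is a hypothetical helper, not part of the shim.

```rust
use std::sync::mpsc::channel;

/// Sends `n` messages while nothing drains the queue, then reports how many
/// messages are sitting in it.
fn backlog_after(n: usize) -> usize {
    let (tx, rx) = channel::<usize>();
    for i in 0..n {
        // Never blocks: every message is buffered in memory.
        tx.send(i).unwrap();
    }
    // Only now does the "receiver" look at the queue.
    rx.try_iter().count()
}
```

Every send succeeds immediately, so the backlog equals the number of messages sent. Memory use grows with the number of unconsumed messages, which is the OOM risk mentioned above.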

@Burning1020 (Member) left a comment

LGTM. Thanks

@mxpv mxpv added this pull request to the merge queue Jun 5, 2024
Merged via the queue into containerd:main with commit 6b225fb Jun 5, 2024
18 checks passed
@ningmingxiao (Contributor) commented Sep 29, 2024

I created another PR to fix this issue; we don't need to use unbounded_channel (it doesn't really solve the problem).
#316
