-
Notifications
You must be signed in to change notification settings - Fork 111
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(network): Rate-limit inbound connections per IP. #7041
Conversation
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## main #7041 +/- ##
==========================================
- Coverage 77.53% 77.47% -0.07%
==========================================
Files 310 310
Lines 41933 41749 -184
==========================================
- Hits 32513 32343 -170
+ Misses 9420 9406 -14 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some overall design advice:
This ticket and PR are about preventing repeated inbound handshakes from the same IP. (Or a small set of IPs.)
We already rate-limit these handshakes, and drop duplicates when they reach the peer set.
Here's what I think the remaining security issues are:
A single IP can have multiple inbound connections in flight between the inbound connection task and the peer set. (About 3-4 if it uses both timeouts fully.) These connections must be successful to be sent to the peer set.
A single IP can also deny service to other inbound peers by continually making connections, and hitting the 1 second rate-limit. These connections can be successful, failed, or time out. (So there's no guarantee they will reach the peer set.)
I should have written this in the ticket, but I hadn't thought it through, because I didn't realise it was going to be scheduled so soon.
So I'd suggest keeping a list of recent inbound connection IPs in the inbound connection task, and expiring them after a timeout.
Outbound peers and the PeerSet are not involved:
I don't think these security issues can be fixed by checking the connections tracked by the peer set or address book.
The AddressBook
only tracks addresses that are valid for outbound connections. The ports of inbound connections are ephemeral, so they aren't stored in the AddressBook
, because we can't make outbound connections to those ports.
The PeerSet
tracks both inbound and outbound connections, but only after they successfully complete a handshake.
It's also ok to leave any duplicate outbound/inbound IPs for the peer set to drop.
Related bugs:
We're currently fixing a concurrency bug in the inbound connection handler, so I'd like to avoid adding more locking to that function. Particularly when it's hidden in trait methods.
Edit: it turns out the bug was #7103, a future using 100% CPU in a busy-loop. So it's not related to locking.
I updated my review to remove the note about a locking bug, because it turned out to be incorrect. Edit: there's no need to work on this either just because I commented 😅 |
…in RecentByIp and account_inbound_connections Updates tests
493df40
to
7de63bb
Compare
This comment was marked as resolved.
This comment was marked as resolved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great, thanks so much for rewriting it! I really appreciate the simpler code.
I have some questions about how this code will handle an attack where Zebra is flooded with inbound connections from different IP addresses. It would be good to fix them here or in a quick follow-up.
I was also wondering about the PR title. I think what this code does now is rate-limit inbound connections to max_connections_per_ip
per MIN_PEER_RECONNECTION_DELAY
, rather than ignoring them completely?
(Which still fixes the original bug where the handshaker could get overloaded.)
Otherwise all my suggestions are optional.
As a follow-up, I think we might want to hide these IpAddr
and the ones in the AddressBook
from the logs, in the same way as PeerSocketAddr
. But I think that's a low priority, because we don't actually log those types or instrument their functions. So the IpAddr
s we've added won't appear in any logs yet.
Uses partition_point & split_off Moves tests to separate module
Co-authored-by: teor <teor@riseup.net>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh this is also an optional nitpick, I doubt most users will notice the difference between "Ignore for 2 minutes" and "rate-limit for 2 minutes". |
Motivation
This prevents some kinds of attacks by making Zebra skip the handshake for these inbound connections.
Closes #6981.
Solution
max_connections_per_ip
recent inbound connections from their IPReview
Part of regular scheduled work.
Reviewer Checklist