Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle connection better: make connectd know which peers are important #7630

Open
wants to merge 12 commits into
base: master
Choose a base branch
from

Conversation

rustyrussell
Copy link
Contributor

This series changes connectd to maintain reconnections, rather than the rather convoluted logic we had before. lightningd simply tells it which peers are "important", based on which peers have channels. We also are smarter about saving the last known address, if we have a successfully established an outgoing connection previously.

This lets connectd be smarter in overload, too, when it needs to select a peer to evict. And it's a chance to update our "startup" logic which tried to space out connections: we keep 10 in flight now, and wait until lightingd has forked off a subdaemon before adding more (though we always let through one a second, for corner cases).

These changes should help large nodes.

@rustyrussell rustyrussell added the Highlight - Stability and Security Refinement of basics, prevention and cures label Aug 31, 2024
@rustyrussell rustyrussell added this to the v24.11 milestone Aug 31, 2024
It's now trivial for us to do this ourselves, since we have gossmap.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Let lightningd feed us hints to try first, but we can extract the
addresses from node_announcement messages ourselves.

(Lightningd used to ask gossipd on our behalf: this is far simpler!)

One side effect of this is that we don't hand back address hints given to us
by lightningd: it would use these again for reconnecting.  This is breaks
test_sendpay_grouping, so we disable it temporarily.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Once connectd is controlling reconnections, it'll need these.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
…not just state.

We're going to use this to ask if there are any channels which make it
important to reconnect to the peer.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If we connected out, remember that address.  We always remember the last
address, but that may be an incoming address.  This is explicitly the last
outgoing address which worked.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is more useful than the last address, which may be it connecting
to us.  And use it when we restore it.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rather than have lightningd call us repeatedly to try to connect, have
it tell us what peers are transient and aren't, and connectd will
automatically try to maintain that connection.

There's a new "downgrade_peer" message to tell it a peer is now
transient: to make it non-transient we simply tell connectd to
connect as a non-transient.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: `connectd` now handles maintaining/reconnecting to important peers, and we remember the last successful address we connected to.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The important flag replaces it, and now we can be more intelligent about
eviction in overload.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We wait until a connection fails, or a subd is connected to the peer,
before letting another one through.  This should prevent us from
overwhelming lightningd on large nodes, but unlike the previous back-off,
it's based on how fast lightningd is, not an arbitrary time.

We also let one through each second, in case we're connecting to many,
but not doing anything but gossip (e.g. 100 explicit connect
commands).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: Reconnecting to peers at startup should be significantly faster (dependent on machine speed).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
@rustyrussell rustyrussell changed the title Handle connection better: make connectd know what are important Handle connection better: make connectd know which peers are important Sep 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Highlight - Stability and Security Refinement of basics, prevention and cures
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant