gossmap: ensure chan not null #7685

Closed

@JssDWt JssDWt commented Sep 20, 2024

Ignore localmods that don't have a corresponding entry in the gossmap.

A crash was observed on this branch: https://github.com/breez/lightning/tree/cln-v24.08-breez with commit breez@bc9e4f5

pay: FATAL SIGNAL 11 (version v24.08-4-gbc9e4f5-modded)
0x5584c2da9cbf send_backtrace
        common/daemon.c:33
0x5584c2da9d44 crashdump
        common/daemon.c:75
0x7fc69664858f ???
        ???:0
0x5584c2dc2864 gossmap_remove_localmods
        common/gossmap.c:984
0x5584c2d94b2f put_gossmap
        plugins/libplugin-pay.c:62
0x5584c2d9ac32 routehint_step_cb
        plugins/libplugin-pay.c:3171
0x5584c2d98fda payment_continue
        plugins/libplugin-pay.c:2450
0x5584c2d99928 shadow_route_cb
        plugins/libplugin-pay.c:3529
0x5584c2d98fda payment_continue
        plugins/libplugin-pay.c:2450
0x5584c2d9b585 direct_pay_override
        plugins/libplugin-pay.c:3550
0x5584c2d9b7a8 direct_pay_listpeerchannels
        plugins/libplugin-pay.c:3621
0x5584c2d93713 handle_rpc_reply
        plugins/libplugin.c:1016
0x5584c2d938b7 rpc_read_response_one
        plugins/libplugin.c:1202
0x5584c2d93964 rpc_conn_read_response
        plugins/libplugin.c:1226
0x5584c2ef37cc next_plan
        ccan/ccan/io/io.c:60
0x5584c2ef3c57 do_plan
        ccan/ccan/io/io.c:422
0x5584c2ef3d10 io_ready
        ccan/ccan/io/io.c:439
0x5584c2ef55fc io_loop
        ccan/ccan/io/poll.c:455
0x5584c2d94006 plugin_main
        plugins/libplugin.c:2230
0x5584c2d8f029 main
        plugins/pay.c:1533
0x7fc696632c89 ???
        ???:0
0x7fc696632d44 ???
        ???:0
0x5584c2d8b7b0 ???
        ???:0
0xffffffffffffffff ???
        ???:0

The branch contains changes compared to v24.08, but I don't think they are related to the crash.
A simple null check should suffice here?
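
As a rough illustration of the kind of null check proposed here, the sketch below loops over local modifications and skips any whose channel cannot be found. All names in it (`struct chan`, `struct local_mod`, `find_chan`, `remove_mods`) are hypothetical stand-ins invented for this example; it is not the actual code in common/gossmap.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; NOT the real gossmap types. */
struct chan {
	uint64_t scid;
	int enabled;
};

struct local_mod {
	uint64_t scid;
};

/* Look up a channel by id; returns NULL when there is no such entry. */
static struct chan *find_chan(struct chan *chans, size_t n, uint64_t scid)
{
	for (size_t i = 0; i < n; i++) {
		if (chans[i].scid == scid)
			return &chans[i];
	}
	return NULL;
}

/* Undo local modifications, ignoring any whose channel is missing.
 * Returns the number of modifications actually undone. */
static size_t remove_mods(struct chan *chans, size_t n_chans,
			  const struct local_mod *mods, size_t n_mods)
{
	size_t undone = 0;

	for (size_t i = 0; i < n_mods; i++) {
		struct chan *c = find_chan(chans, n_chans, mods[i].scid);

		/* Without this check, the write below would dereference
		 * NULL and crash, much like the SIGSEGV reported above. */
		if (c == NULL)
			continue;

		c->enabled = 0;
		undone++;
	}
	return undone;
}
```

Calling `remove_mods` with a modification that refers to an unknown channel id leaves that entry untouched and processes the rest, instead of crashing on the first missing channel.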

Checklist

Before submitting the PR, ensure the following tasks are completed. If an item is not applicable to your PR, please mark it as checked:

  • The changelog has been updated in the relevant commit(s) according to the guidelines.
  • Tests have been added or modified to reflect the changes.
  • Documentation has been reviewed and updated as needed.
  • Related issues have been listed and linked, including any that this PR closes.

Ignore localmods that don't have a corresponding
entry in the gossmap.

Changelog-Fixed: pay plugin crash with incomplete gossmap
@cdecker cdecker added this to the v24.08.2 milestone Sep 24, 2024

TRIGEMTECH commented Sep 27, 2024

I encountered a similar issue after an overnight power surge.
Is it safe to delete gossip_store?

I was able to get lightning started, but it appears stuck in an infinite loop, with the following two lines repeating for one particular channel at the same offsets:
lightning_gossipd: gossmap: redundant channel_announce for ...!
lightning_connectd: gossmap: redundant channel_announce for ...!
Eventually the looping stops but resumes after several minutes.

The output comes from line 471 of gossmap.c:
warnx("gossmap: redundant channel_announce for %s, offsets %u and %zu!",

I added the following lines (982 and 983), as shown in the single changed file, and recompiled. It made no difference.
    if (chan == NULL)
        continue;

---FINAL UPDATE---
I was able to resolve my issue by reading a lot and by:

  1. shutting everything down and rebooting the system
  2. deleting gossip_store and gossip_store.corrupt
  3. restarting lightningd

gossip_store has been re-created and, based on its previous file size, it appears it will take several hours to complete.
The answer to the question 'Is it safe to delete gossip_store?' is yes.
---NO FURTHER ACTION REQUIRED---

Terminal output from the original attempt to restart lightningd is shown below.

lightning_gossipd: gossip_store: get delete entry offset 34356327/14528 (version v24.08.1-modded)
0x58da2a2455f3 send_backtrace
common/daemon.c:33
0x58da2a24f026 status_failed
common/status.c:221
0x58da2a23cb8b gossip_store_get_with_hdr
gossipd/gossip_store.c:466
0x58da2a23cc06 check_msg_type
gossipd/gossip_store.c:491
0x58da2a23cd99 gossip_store_set_flag
gossipd/gossip_store.c:509
0x58da2a23cfac gossip_store_del
gossipd/gossip_store.c:561
0x58da2a23e788 process_channel_update
gossipd/gossmap_manage.c:793
0x58da2a23f0a0 gossmap_manage_channel_update
gossipd/gossmap_manage.c:901
0x58da2a23b923 handle_recv_gossip
gossipd/gossipd.c:215
0x58da2a23ba12 connectd_req
gossipd/gossipd.c:307
0x58da2a245909 handle_read
common/daemon_conn.c:35
0x58da2a385a49 next_plan
ccan/ccan/io/io.c:60
0x58da2a385f1a do_plan
ccan/ccan/io/io.c:422
0x58da2a385fd7 io_ready
ccan/ccan/io/io.c:439
0x58da2a387949 io_loop
ccan/ccan/io/poll.c:455
0x58da2a23bcf1 main
gossipd/gossipd.c:672
0x704350e2a1c9 __libc_start_call_main
../sysdeps/nptl/libc_start_call_main.h:58
0x704350e2a28a __libc_start_main_impl
../csu/libc-start.c:360
0x58da2a238874 ???
???:0
0xffffffffffffffff ???
???:0
2024-09-27T02:20:05.267Z BROKEN connectd: STATUS_FAIL_GOSSIP_IO: gossipd exited?
lightningd: connectd failed (exit status 242), exiting.
Lost connection to the RPC socket.Lost connection to the RPC socket.Lost connection to the RPC socket.Lost connection to the RPC socket.Lost connection to the RPC socket.Lost connection to the RPC socket.Lost connection to the RPC socket.Lost connection to the RPC socket.Lost connection to the RPC socket.Lost connection to the RPC socket.Lost connection to the RPC socket.Lost connection to the RPC socket.Lost connection to the RPC socket.Lost connection to the RPC socket.

@rustyrussell

Ack!

This is the same as bc1aabb which is already in master: @ShahanaFarooqui might want to cherry-pick that for the branch instead?

cdecker commented Oct 7, 2024

This was already deployed as part of #7707.

@cdecker cdecker closed this Oct 7, 2024