Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: restore SIGHUP in signal handler #1305

Closed

Conversation

ThomasFrans
Copy link
Contributor

Restore the SIGHUP signal handler which was accidentally removed in a067ab2.

Restore the SIGHUP signal handler which was accidentally removed in
a067ab2.
@ThomasFrans
Copy link
Contributor Author

I should have checked the previous PR better to make sure I didn't mix them up. This fixes it.

@ThomasFrans ThomasFrans marked this pull request as draft October 8, 2023 12:41
@ThomasFrans
Copy link
Contributor Author

@hrkfdn I'm keeping this one as a draft for now because I found a pretty annoying issue that occurs when this is merged:

When ncspot is closed normally, it exits like you'd expect.
When ncspot's pane or tab in zellij is closed, it exits like you'd expect.
When you send SIGTERM or SIGHUP to ncspot, it exits like you'd expect.
BUT when you open it in Alacritty and close the terminal, it SOMETIMES closes, and other times it goes into a loop and spikes CPU usage on one core.

I don't know the difference between closing the tab or closing the terminal emulator. I thought it should behave the same. I think it would be best to wait with merging this and maybe even revert the previous merge as it might also show this behavior. I haven't had the chance to test yet, just want to make sure this isn't released or kept in main for too long as it's pretty annoying.

Sorry for these two bad PRs, but it has been hard to test every possible combination (signals, terminal multiplexer, terminal itself...) between the two PRs (SIGHUP and async signal handling). I also assumed that if it worked when closing it in a multiplexer pane, it would also work when closing the terminal. I'll see whether I can improve the async version and make sure it works properly.

@hrkfdn
Copy link
Owner

hrkfdn commented Oct 8, 2023

No worries, I am also able to reproduce the delayed termination as well as the CPU hogging on macOS, but didn't have time to investigate further.

@ThomasFrans
Copy link
Contributor Author

I tried to look further into this and added some logging to see exactly where the problem occurs (my debugger doesn't work well with async stuff and will instead open a file with assembly, which is not useful).

Log output when manually sending SIGTERM (filtered with grep -e posix -e handle -e Quit):

[2023-10-10][16:37:42] [ncspot::application] [TRACE] posix signal handler task: started
[2023-10-10][16:37:43] [ncspot::commands] [TRACE] handle move right 1 command
[2023-10-10][16:37:44] [ncspot::commands] [TRACE] handle move right 1 command
[2023-10-10][16:37:44] [ncspot::commands] [TRACE] handle move left 1 command
[2023-10-10][16:37:44] [ncspot::commands] [TRACE] handle move left 1 command
[2023-10-10][16:37:46] [ncspot::application] [DEBUG] posix signal handler task: received 15
[2023-10-10][16:37:46] [ncspot::application] [DEBUG] posix signal handler task: sending `Quit` command
[2023-10-10][16:37:46] [ncspot::commands] [TRACE] handle quit command

Log output when quit by closing Zellij pane (filtered with grep -e posix -e handle -e Quit):

[2023-10-10][16:41:29] [ncspot::application] [TRACE] posix signal handler task: started
[2023-10-10][16:41:30] [ncspot::commands] [TRACE] handle move right 1 command
[2023-10-10][16:41:30] [ncspot::commands] [TRACE] handle move right 1 command
[2023-10-10][16:41:30] [ncspot::commands] [TRACE] handle move left 1 command
[2023-10-10][16:41:31] [ncspot::commands] [TRACE] handle move left 1 command
[2023-10-10][16:41:32] [ncspot::application] [DEBUG] posix signal handler task: received 1
[2023-10-10][16:41:32] [ncspot::application] [DEBUG] posix signal handler task: sending `Quit` command
[2023-10-10][16:41:32] [ncspot::application] [DEBUG] posix signal handler task: received 1
[2023-10-10][16:41:32] [ncspot::application] [DEBUG] posix signal handler task: sending `Quit` command
[2023-10-10][16:41:32] [ncspot::commands] [TRACE] handle quit command

The case where it's closed from the terminal gives this output (filtered with grep -e posix -e handle -e Quit):

[2023-10-10][16:39:30] [ncspot::application] [TRACE] posix signal handler task: started
[2023-10-10][16:39:31] [ncspot::commands] [TRACE] handle move right 1 command
[2023-10-10][16:39:31] [ncspot::commands] [TRACE] handle move right 1 command
[2023-10-10][16:39:32] [ncspot::commands] [TRACE] handle move left 1 command
[2023-10-10][16:39:33] [ncspot::commands] [TRACE] handle move left 1 command
[2023-10-10][16:39:33] [ncspot::application] [DEBUG] posix signal handler task: received 1
[2023-10-10][16:39:33] [ncspot::application] [DEBUG] posix signal handler task: sending `Quit` command
[2023-10-10][16:39:33] [ncspot::application] [DEBUG] posix signal handler task: received 1
[2023-10-10][16:39:33] [ncspot::application] [DEBUG] posix signal handler task: sending `Quit` command

Adding further logging shows that the signal handler sends the closure which executes the Quit command through the channel to Cursive, but it is never executed by Cursive in the main thread. I don't really understand why it would be executed every time except when closing the controlling terminal. I'm a bit lost here to be honest.

@ThomasFrans
Copy link
Contributor Author

ThomasFrans commented Oct 10, 2023

The event loop blocks on something when the callback is sent from the signal handler task. I'm still kind of lost...

[2023-10-10][19:56:12] [ncspot::application] [TRACE] event loop: start step
[2023-10-10][19:56:12] [ncspot::application] [TRACE] event loop: cursive step
[2023-10-10][19:56:12] [ncspot::application] [DEBUG] posix signal handler task: received 1
[2023-10-10][19:56:12] [ncspot::application] [DEBUG] posix signal handler task: handling SIGTERM or SIGHUP
[2023-10-10][19:56:12] [ncspot::application] [DEBUG] posix signal handler task: received 1
[2023-10-10][19:56:12] [ncspot::application] [DEBUG] posix signal handler task: handling SIGTERM or SIGHUP
[2023-10-10][19:56:12] [ncspot::application] [TRACE] event loop: process ncspot events
[2023-10-10][19:56:12] [ncspot::application] [TRACE] event loop: end step
[2023-10-10][19:56:12] [ncspot::application] [TRACE] event loop: start step
[2023-10-10][19:56:12] [ncspot::application] [TRACE] event loop: cursive step
** Event loop blocks somewhere right after receiving the signal **

@ThomasFrans
Copy link
Contributor Author

From what I can see, it's an issue introduced by the crossterm backend. The ncurses one doesn't keep the process running. Currently I suspect Cursive is doing a write to /dev/tty which is closed, and that write blocks on the crossterm backend and returns immediately on the ncurses backend. I guess it could be resolved by switching the default backend (again)?

@hrkfdn This would solve #1309 and #1310 and #1282. I'd love to hear thoughts on this idea. If it's an OK solution, I'd try to see whether I can open an issue at Cursive or Crossterm to resolve the issue there, and meanwhile switch backend and add the SIGHUP handler back to finally close all these issues 😅

@ThomasFrans
Copy link
Contributor Author

ThomasFrans commented Oct 11, 2023

Oh FFS, I'm so pissed. This happens every time! You put hours into solving an issue only to find it's a known issue elsewhere 🤣. Lesson learned (he says as he will just do the same thing next time) crossterm-rs/crossterm#793

@hrkfdn
Copy link
Owner

hrkfdn commented Oct 12, 2023

Ugh, there really is no perfect backend it seems. Switching back to Termion will break Shift+Up/Shift+Down keybindings to move items around in the queue. Crossterm has this CPU hogging bug. Pancurses doesn't work well with 256 colors.. 😭

@ThomasFrans
Copy link
Contributor Author

TBH Crossterm seems pretty dea... on vacation for now. Termion is... what is the right word... retired? Pancurses has been chilling since 2021. I think the safest, most robust option right now would be to switch to ncurses by default as it's the most well known library. It's not that I don't believe in the other ones, but terminals are complex to handle and ncurses has the advantage of using the terminfo library, which the other backends don't (I think).

@hrkfdn
Copy link
Owner

hrkfdn commented Oct 12, 2023

Yeah, I'm hoping to have to time to finally look at all of this (and other issues/PRs) this weekend. Been super busy these past weeks unfortunately, sorry for all the delays..

@ThomasFrans
Copy link
Contributor Author

No problem at all. I understand that maintainers aren't available 24/7. I'm just making PRs as I'm fixing/improving things and I know it can take some time before they are discussed. Until then, I'll just work on other stuff 🙂

@ThomasFrans ThomasFrans mentioned this pull request Oct 12, 2023
@ThomasFrans ThomasFrans deleted the fix-restore-sighup-handler branch January 27, 2024 21:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants