Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix shutdown signals for canto_start #7

Merged

Conversation

kimrutherford
Copy link

Copy link
Owner

@jseager7 jseager7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this. The only change I'd suggest is removing --signal-on-hup=QUIT, because the HUP signal is used to gracefully restart the server program, according to the docs for Server::Starter: "to gracefully restart the server program, send SIGHUP to the superdaemon." Starman also "supports HUP for graceful worker restarts".

In keeping with this behaviour, the restart command on the service script uses HUP:

  restart)
    pid=`/bin/cat $CANTO_SPACE/$PID_PATH`
    echo restarting $pid
    (cd $CANTO_SPACE; docker exec canto kill -HUP $pid)
    ;;

Changing the behaviour of the HUP signal to QUIT would presumably mean that a restart actually just stops the server and never starts it again. I haven't tested this to confirm, though.

@kimrutherford
Copy link
Author

The way I read the documentation, the --signal-on-hup=QUIT tells the Server::Starter code that when a HUP is sent to server_starter, it should sent QUIT to its sub-process (canto_start or "starman master" if you run top or ps - see below). And sending QUIT instead of TERM means that the sub-processes of canto_start are shutdown gracefully.

I've tested with --signal-on-hup=QUIT on the PomBase test Canto instance and it seems to work OK. Everything shuts down and restarts, seemingly gracefully.

The documentation confused me until I had a look at the process hierarchy. Here's an example from the PomBase machine. The "starman master" is actually canto_start which the Starman code renamed after starting. It's only running the Plack and Starman code in this process. The "starman worker" processes are also canto_start processes but running the Canto code.

In the docs when they talk about the "server process" they mean the "starman master" process. Well, I think they do. :-)

/usr/bin/perl /usr/local/bin/start_server --pid-file=/var/pomcur/run/test --port 7001 --signal-on-hup=QUIT --signal-on-term=QUIT -- ./script/canto_start --workers 4 --keepalive-timeout 5 -s Starman
   \_ starman master
      \_ starman worker
      \_ starman worker
      \_ starman worker
      \_ starman worker

I found the documentation hard to parse so please let me know if what I have deduced matches your reading of it. I'm happy to chat about it on Skype sometime.

@jseager7 jseager7 dismissed their stale review July 28, 2021 08:59

Requested change was incorrect

@jseager7
Copy link
Owner

Thanks a lot for the clarification. I think I was probably getting confused by the mixed terminology between the Starman and server_starter docs. Your explanation makes sense with regards to the server_starter docs, and you've actually taken the time to test it (unlike me).

In hindsight, it doesn't make a whole lot of sense why the superdaemon would permit overriding its own interface (treating QUIT as HUP) like I was thinking, but overriding the signal sent to the worker processes does make sense.

I'll merge this now.

@jseager7 jseager7 merged commit 8748bc9 into jseager7:init-script-defaults Jul 28, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants