Add example demonstrating graceful shutdown #2

Merged
brandur merged 1 commit into master from brandur-graceful-shutdown-example
Nov 9, 2023

@brandur brandur commented Nov 9, 2023

As I was writing the docs for graceful shutdown, I realized that it
wasn't a half bad idea to include a full code sample for a realistic
program that, when exiting, would (1) try a soft stop to start, and (2)
do a hard stop if the soft stop didn't work in time.

This is somewhat non-trivial code though, and I'd be afraid to just
include it in the docs without testing it. I started writing the code to be
actually runnable, and realized that as long as it was going to be
runnable anyway, we may as well also include an example for it and
commit it to the repository.

So that's what we do here: add an example test that shows a realistic
shutdown loop demonstrating both a soft and hard stop.

// finish on context cancellation.
fmt.Printf("Soft stop succeeded\n")

case <-time.After(100 * time.Millisecond):

Not super happy with this wait. It's not strictly necessary to include it, but if we don't, then we don't show any opportunity for the soft stop to work at all, so I put it in.

@brandur brandur requested a review from bgentry November 9, 2023 03:08
@brandur brandur force-pushed the brandur-graceful-shutdown-example branch from de1d2bd to 3640eff Compare November 9, 2023 03:09
brandur commented Nov 9, 2023

TY.

@brandur brandur merged commit f246c20 into master Nov 9, 2023
5 checks passed
@brandur brandur deleted the brandur-graceful-shutdown-example branch November 9, 2023 03:44
brandur added a commit that referenced this pull request Nov 10, 2023
Follows up #2 with a few fixes/changes. As I was writing documentation
for graceful shutdown I realized there were a few things that weren't
quite ideal:

* The job should return an error in the event of cancellation so that it
  can be persisted as errored and be worked again.

* The example now respects either `SIGINT` or `SIGTERM`. `SIGTERM` is
  what's used on Heroku, but `SIGINT` is the standard signal from
  `Ctrl+C` in a terminal, so by respecting both we can have a program
  that works well in either development or a common hosted environment.

* Add a third phase in which the program initiates an unclean stop by
  not waiting on stop any longer. This is probably something that most
  programs should have because it's going to be reasonably easy to write
  workers that accidentally don't respect context cancellation and get
  stuck.

* Add a 10-second timeout to each phase. This is for Heroku's benefit.
  It'll send one `SIGTERM` and wait 30 seconds before issuing `SIGKILL`.
  So the program here waits 10 seconds for a soft stop, another 10
  seconds for a hard stop, and then exits uncleanly of its own volition
  before getting `SIGKILL`ed.