Developers should have the tools to investigate the causes of failures in deployment.
At the very minimum, we should be able to get Rust backtraces whenever a service panics.
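As a minimal sketch of that baseline, a service could install a panic hook that captures a backtrace regardless of whether `RUST_BACKTRACE` is set. The names `install_backtrace_hook` and `LAST_PANIC_REPORT` are hypothetical, and the in-memory slot only stands in for whatever logging pipeline the services actually use:

```rust
use std::backtrace::Backtrace;
use std::panic;
use std::sync::Mutex;

// Hypothetical sink: keeps the most recent panic report in memory.
// A real service would forward the report to its own logging setup.
static LAST_PANIC_REPORT: Mutex<Option<String>> = Mutex::new(None);

// Hypothetical helper: call once at service startup.
fn install_backtrace_hook() {
    panic::set_hook(Box::new(|info| {
        // force_capture() records a backtrace even when RUST_BACKTRACE is unset.
        let backtrace = Backtrace::force_capture();
        // The Display output of `info` includes the panic location and message.
        let report = format!("service panicked: {info}\nbacktrace:\n{backtrace}");
        eprintln!("{report}");
        // Tolerate a poisoned lock so the hook itself never panics.
        let mut slot = LAST_PANIC_REPORT
            .lock()
            .unwrap_or_else(|poisoned| poisoned.into_inner());
        *slot = Some(report);
    }));
}

fn main() {
    install_backtrace_hook();
    // Simulate a failing request handler; catch_unwind keeps the demo running.
    let result = panic::catch_unwind(|| {
        panic!("simulated failure");
    });
    assert!(result.is_err());
}
```

Setting `RUST_BACKTRACE=1` in the service environment gives similar output from the default hook without code changes. In either case, readable frame names depend on the binary being built with debug symbols.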
But beyond that, I suggest that a share of on-call incidents be followed up with a failure analysis attached to the incident in the OnCall service. The analysis should trace the cause to a GitHub bug issue containing the post-mortem information or, if the information at hand is insufficient, to an issue proposing additional instrumentation to troubleshoot the problem.
Since incidents tend to be repetitive, the developer on call should not be obliged to investigate every incident that occurred during their shift and was resolved automatically, but at least one incident per shift should be investigated.