Reduce ContainerRestarting alert noise
Once the alert has fired, keep it firing for 10m after its condition clears.
That should help with crashloops, where the alert keeps getting resolved and
then firing again.
hectorhuertas committed Oct 10, 2023
1 parent 15966f8 commit 26cdc3a
Showing 2 changed files with 2 additions and 0 deletions.
1 change: 1 addition & 0 deletions common/all.yaml.tmpl
@@ -143,6 +143,7 @@ groups:
dashboard: "https://grafana.$ENVIRONMENT.$PROVIDER.uw.systems/d/VAE0wIcik/kubernetes-pod-resources?orgId=1&refresh=1m&from=now-12h&to=now&var-instance=All&var-namespace={{ $labels.namespace }}"
- alert: SystemPodRestartingOften
expr: increase(kube_pod_container_status_restarts_total{namespace=~"kube-system|sys-.*"}[10m]) > 3
+ keep_firing_for: 10m
labels:
team: infra
annotations:
1 change: 1 addition & 0 deletions common/container.yaml.tmpl
@@ -6,6 +6,7 @@ groups:
rules:
- alert: ContainerRestartingOften
expr: increase(kube_pod_container_status_restarts_total[10m]) > 3
+ keep_firing_for: 10m
labels:
group: container
annotations:
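
To illustrate why `keep_firing_for` should cut down the resolve/re-fire churn described in the commit message, here is a minimal Python sketch of the idea. It is a simplified model, not Prometheus's actual implementation: the function names (`firing_timeline`, `count_resolves`) and the `flapping` trace are hypothetical, the rule is assumed to have no `for:` delay, and one list entry stands for one rule evaluation per minute.

```python
def firing_timeline(condition, keep_firing_for=0, interval=1):
    """Return one firing/not-firing flag per rule evaluation.

    condition: list of booleans, True when the alert expression matched.
    keep_firing_for: minutes to stay firing after the condition clears
                     (0 models a rule without keep_firing_for).
    interval: minutes between evaluations (assumed, for illustration).
    """
    firing = False
    cleared_at = None  # time the condition was first seen false while firing
    timeline = []
    for step, active in enumerate(condition):
        now = step * interval
        if active:
            # Condition matched: fire immediately and reset the hold timer.
            firing = True
            cleared_at = None
        elif firing:
            # Condition cleared: start (or continue) the hold period.
            if cleared_at is None:
                cleared_at = now
            if now - cleared_at >= keep_firing_for:
                firing = False
                cleared_at = None
        timeline.append(firing)
    return timeline


def count_resolves(timeline):
    """Count firing -> resolved transitions, i.e. 'resolved' notifications."""
    return sum(1 for prev, cur in zip(timeline, timeline[1:]) if prev and not cur)


# Hypothetical crashloop trace: the expression matches for 2 minutes,
# clears for 5, and repeats, before the container finally stabilises.
flapping = [True] * 2 + [False] * 5 + [True] * 2 + [False] * 5 + [True] * 2 + [False] * 15

if __name__ == "__main__":
    print("resolves without keep_firing_for:", count_resolves(firing_timeline(flapping)))
    print("resolves with keep_firing_for=10:", count_resolves(firing_timeline(flapping, keep_firing_for=10)))
```

In this made-up trace the alert resolves three times without the hold period and only once with a 10-minute hold, because each 5-minute gap is shorter than the hold and the alert never leaves the firing state in between.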
