DagsterExecutionInterruptedError in runs on Kubernetes #12943
My |
Replies: 3 comments 6 replies
This can also happen when the Kubernetes cluster decides to evict the pod where your run is happening, for example to move it to a new node. Depending on how your cluster is set up, you may be able to signal via annotations that your pod should not be evicted in this way. For example, if the cluster autoscaler is evicting your pod, you can apply an annotation to your pods to prevent them from being evicted (see https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#what-types-of-pods-can-prevent-ca-from-removing-a-node for more information).
See https://docs.dagster.io/deployment/guides/kubernetes/customizing-your-deployment#instance-level-kubernetes-configuration for more information on how to apply annotations to the Dagster run pods, including from your Helm chart.
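As a rough sketch of what such an annotation could look like when set per job rather than through Helm: the cluster autoscaler FAQ linked above documents the `cluster-autoscaler.kubernetes.io/safe-to-evict` annotation, and the Dagster page linked above describes a `dagster-k8s/config` tag with a `pod_template_spec_metadata` key. The helper name `safe_to_evict_tags` is made up for illustration, and this only helps if the cluster autoscaler is what is evicting your pods.

```python
# Hypothetical helper: builds Dagster job tags that ask the Kubernetes
# cluster autoscaler not to evict the run pod. Assumes the autoscaler is
# the component doing the eviction; other eviction sources ignore it.

SAFE_TO_EVICT = "cluster-autoscaler.kubernetes.io/safe-to-evict"


def safe_to_evict_tags(extra_annotations=None):
    """Return a tags dict carrying pod annotations under dagster-k8s/config."""
    annotations = dict(extra_annotations or {})
    # Kubernetes annotation values must be strings, hence "false" not False.
    annotations[SAFE_TO_EVICT] = "false"
    return {
        "dagster-k8s/config": {
            "pod_template_spec_metadata": {"annotations": annotations}
        }
    }


# Usage with Dagster (not imported here so the sketch runs without it):
#
#   from dagster import job
#
#   @job(tags=safe_to_evict_tags())
#   def my_job():
#       ...
```

Setting the same annotation at the instance level, as the linked Dagster page describes, applies it to every run pod instead of a single job.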
Does the Agent also trigger this when a code update is detected? We've noticed that many of these happen when we do code updates.
`DagsterExecutionInterruptedError` means the process received a `SIGINT` or `SIGTERM`. Most commonly this means the K8s pod the process was running on was terminated by the Kubernetes cluster. You can use `kubectl` (or other Kubernetes tools) to investigate.

In the Dagster event log you should see an event with a string followed by a Kubernetes job name and namespace. With those, you can use `kubectl` to look at the underlying job.
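
For the `kubectl` step, here is a minimal sketch of commands that may help once you have the job name and namespace from the event log. The function name `kubectl_inspect_commands` and the placeholder values `dagster-run-example` and `dagster` are hypothetical; the sketch also assumes the job's pods carry the `job-name` label that Kubernetes adds to pods created by a Job.

```python
import shlex


def kubectl_inspect_commands(job_name, namespace):
    """Build kubectl commands for inspecting a run's Kubernetes job."""
    ns = ["--namespace", namespace]
    selector = ["--selector", f"job-name={job_name}"]
    return [
        # Job status, conditions, and recent events.
        ["kubectl", "describe", "job", job_name, *ns],
        # Pods created by the job, with the node each one ran on.
        ["kubectl", "get", "pods", *selector, "-o", "wide", *ns],
        # Detailed pod state, including termination reason and events.
        ["kubectl", "describe", "pods", *selector, *ns],
    ]


if __name__ == "__main__":
    # Placeholder values; substitute the ones from your event log.
    for cmd in kubectl_inspect_commands("dagster-run-example", "dagster"):
        print(shlex.join(cmd))
```

The sketch only prints the commands rather than running them, so you can review them first. Pod events are typically where an eviction or node scale-down shows up, though Kubernetes only retains events for a limited time.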