reflected from private internal issue tracker by @chakrn
When we use caikit-ray-backend to submit a new job, that job can run indefinitely because no timeout is applied. We need a configurable timeout value, and the job should be cancelled once it exceeds that time.
After a quick discussion with Dean, it looks like this can probably be done simply by changing the ray.get() call to ray.wait() (which should have been the case anyway), then polling for status and killing the job after a certain elapsed time.
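For illustration, here is a minimal sketch of the polling approach described above. It is not the actual caikit-ray-backend code: the names `poll_until_done` and `run_with_timeout` are hypothetical, and it assumes the job is represented by a Ray object ref from a remote task. The wait and cancel functions are passed in so the loop can be exercised without a cluster; with Ray they would be `ray.wait` and `ray.cancel`.

```python
import time
from typing import Any, Callable, List, Tuple


def poll_until_done(
    ref: Any,
    wait_fn: Callable[..., Tuple[List[Any], List[Any]]],
    cancel_fn: Callable[[Any], Any],
    timeout_s: float,
    poll_interval_s: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Return True if ref finishes within timeout_s; otherwise cancel it and return False."""
    deadline = clock() + timeout_s
    while True:
        remaining = deadline - clock()
        if remaining <= 0:
            # Time budget exhausted: ask the backend to cancel the work.
            cancel_fn(ref)
            return False
        # Block for at most one poll interval, and never past the deadline.
        ready, _not_ready = wait_fn(
            [ref], num_returns=1, timeout=min(poll_interval_s, remaining)
        )
        if ready:
            return True


def run_with_timeout(ref: Any, timeout_s: float) -> Any:
    """Hypothetical wrapper showing how the loop could be wired to Ray."""
    import ray  # imported lazily so poll_until_done stays usable without Ray

    if poll_until_done(ref, ray.wait, ray.cancel, timeout_s):
        return ray.get(ref)  # the result is ready, so this should not block for long
    raise TimeoutError(f"Ray task exceeded {timeout_s}s and was cancelled")
```

If the job runs inside an actor rather than a plain task, cancellation may need a different call (for example `ray.kill` on the actor handle), so treat the cancel step as an assumption to verify against how the backend actually submits work.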
> Then poll for status and kill the job after a certain elapsed time.
We need to make the timeout configurable. We will use the get_config function from caikit for this (example import statement and example usage). This allows these values to be configured in several ways, e.g. via config.yml or by setting an environment variable.
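As a rough sketch of what reading the timeout could look like: the config section `ray_backend`, the key `job_timeout_s`, the default value, and the helper names below are all hypothetical and not taken from the repository. The only caikit call used is `get_config`, which is assumed here to return a dict-like object. Values are passed through `float()` because an environment variable override may arrive as a string.

```python
from typing import Any, Mapping

DEFAULT_TIMEOUT_S = 3600.0  # hypothetical default, not from the repository


def resolve_timeout(
    config: Mapping[str, Any], default: float = DEFAULT_TIMEOUT_S
) -> float:
    """Read a hypothetical ray_backend.job_timeout_s value from a dict-like config."""
    section = config.get("ray_backend") or {}
    timeout = float(section.get("job_timeout_s", default))
    if timeout <= 0:
        raise ValueError(f"job_timeout_s must be positive, got {timeout}")
    return timeout


def timeout_from_caikit() -> float:
    """Hypothetical wrapper that pulls the value from caikit's merged config."""
    from caikit import get_config  # imported lazily so resolve_timeout works without caikit

    return resolve_timeout(get_config())
```

Keeping the lookup in one small function means the default and the validation live in a single place, regardless of whether the value came from config.yml or from the environment.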
ibm-peach-fish added a commit to ibm-peach-fish/caikit-ray-backend that referenced this issue on Oct 24, 2023