RFC 27: consider adding flags to alloc request #423
Comments
In #5378, @ryanday36 mentioned an alternative to a standby flag: allow a duration range in jobspec, where a standby job would specify a min < max duration and would become eligible for preemption once the min duration expires. For traditional standby, min = 0, but a nonzero min would let a job request a minimum amount of work before it can be preempted. It seems like we discussed this and decided it was a better idea than the flag, but I can't find anything to cite here. A duration range would require an update to Shall we change this issue from flag to duration range?
Yes, that's a good idea. I wonder if it would be easier to add an optional minimum duration instead of turning duration into a range, though...
BTW, I just recalled that when we discussed this minimum-duration proposal, one possible solution was to still use the preemptible flag, but to have the job-manager post the flag to the eventlog as soon as the minimum runtime expired. We should explore whether that makes sense, or whether we leave it up to the scheduler to look for any minimum runtime and implement preemption internally, without the need for any flags.
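To make the minimum-duration semantics concrete, here is a rough Python sketch of when a running job would become preemptible. It is only an illustration: the `RunningJob` type and the `min_duration` field are hypothetical names, not part of any current jobspec or flux-core API. The same computed time could be used by either party: the job-manager, to decide when to post a flag, or the scheduler, to plan around it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunningJob:
    jobid: int
    t_start: float                        # time the job started running (seconds)
    duration: float                       # requested maximum duration (seconds)
    min_duration: Optional[float] = None  # hypothetical field; None means never preemptible


def preemptible_at(job: RunningJob) -> Optional[float]:
    """Return the time at which the job becomes preemptible, or None."""
    if job.min_duration is None:
        return None
    # Clamp so that a min larger than the max duration cannot push
    # preemptibility past the job's own expiration.
    return job.t_start + min(job.min_duration, job.duration)


def is_preemptible(job: RunningJob, now: float) -> bool:
    """True once the job's minimum runtime has elapsed."""
    t = preemptible_at(job)
    return t is not None and now >= t
```

With this model, traditional standby is simply `min_duration=0.0` (preemptible as soon as it starts), and a job without `min_duration` is never preemptible.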
Yeah good thought, since I think we would amend v1 jobspec and if we did the min/max thing (like the way Maybe
Yes I like the idea of setting the flag to centralize processing of this time stuff. If exceeding RFC 27 would not only need flags added to the alloc request and hello response, but also a new RPC for setting flags on existing jobs.
Good suggestion!
Well, thinking about this a bit more, it seems like the scheduler will want to incorporate future preemptibility into its "plan", so I'm not sure how useful it is to know when a job becomes preemptible in real time? Maybe it would be better to just have the scheduler raise a fatal exception on a job when it reaches that point in its schedule and skip the notification?
Yeah, that's a good thought. We should get the opinion of Fluxion developers here, since the original plan was to use a flag all along. I imagine it is much easier to support scheduling of preemptible jobs that would have Either way it doesn't seem like the flag is useful, so we can probably close this issue or change it to add the
Before we discard that idea completely we might want to try a little prototype of
Does libschedutil examine jobspec? In any event, an experiment with sched-simple is a good idea. Since sched-simple is not a planning scheduler, preemptible jobs would not have much effect if they are submitted behind a pending job. However, it could be used to test that submission of a high priority job kills off any existing preemptible jobs if that would allow the job to run. 🤷
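For what it's worth, the check described above (cancel preemptible jobs only if doing so actually lets the high priority job run) could look roughly like the following. This is a toy Python model, not sched-simple code: resources are reduced to a core count, and `select_victims` and the tuple layout are made up for illustration.

```python
from typing import List, Optional, Tuple


def select_victims(
    needed: int,
    free: int,
    running: List[Tuple[int, int, bool]],
) -> Optional[List[int]]:
    """Pick preemptible jobs to cancel so a request for `needed` cores fits.

    `running` holds (jobid, ncores, preemptible) tuples.
    Returns [] if the request already fits, a list of jobids to cancel
    if preemption makes it fit, or None if it cannot fit even after
    cancelling every preemptible job (in which case nothing is cancelled).
    """
    if needed <= free:
        return []
    victims = []
    reclaimed = free
    for jobid, ncores, preemptible in running:
        if not preemptible:
            continue
        victims.append(jobid)
        reclaimed += ncores
        if reclaimed >= needed:
            return victims
    return None
```

Returning `None` rather than a partial list matters here: a non-planning scheduler should not kill standby jobs unless the pending job can start as a result.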
In flux-framework/flux-core#5739 there is a proposal to add a flag to preemptible/standby jobs. A scheduler can then use this flag to determine if a job may be canceled to make way for a higher priority job. However, currently flags are not shared with the scheduler, which would have to read the submit event to get the flags.

We should consider adding flags to the alloc request (which contains most of the rest of the submit event context).
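As a rough illustration of the proposal, a scheduler receiving a `flags` field in the alloc request could test it directly instead of reading each job's eventlog. The Python sketch below is hypothetical throughout: the payload keys, the `FLAG_PREEMPTIBLE` bit value, and `decode_alloc_request` are assumptions for illustration and are not taken from RFC 27 or flux-core.

```python
import json

# Hypothetical bit value; the real flag (if adopted) would be defined in flux-core.
FLAG_PREEMPTIBLE = 0x100


def decode_alloc_request(payload: str) -> dict:
    """Extract the job id and preemptibility from a JSON alloc request.

    Assumes an illustrative payload with an "id" key and an optional
    integer "flags" bitmask. A missing "flags" key is treated as 0 so
    requests from a job manager that does not send flags still decode.
    """
    msg = json.loads(payload)
    flags = msg.get("flags", 0)
    return {
        "id": msg["id"],
        "preemptible": bool(flags & FLAG_PREEMPTIBLE),
    }
```

Treating an absent `flags` key as zero keeps the change backward compatible on the scheduler side.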