RFC 27: consider adding flags to alloc request #423

Open
grondo opened this issue Aug 15, 2024 · 9 comments

Comments

@grondo
Contributor

grondo commented Aug 15, 2024

In flux-framework/flux-core#5739 there is a proposal to add a flag that marks jobs as preemptible/standby. A scheduler could then use this flag to determine whether a job may be canceled to make way for a higher priority job. However, flags are not currently shared with the scheduler, which would have to read the submit event to get them.

We should consider adding flags to the alloc request (which contains most of the rest of the submit event context).
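To make the proposal concrete, here is a minimal sketch of an alloc request payload that carries flags. The `id`, `priority`, `userid`, `t_submit`, and `jobspec` keys are meant to mirror the existing RFC 27 request and should be checked against the RFC; the `flags` key, the `"preemptible"` flag name, and both helper functions are hypothetical, since nothing has been specified yet.

```python
def make_alloc_request(jobid, priority, userid, t_submit, jobspec, flags=None):
    """Build an alloc-style request payload (illustrative only)."""
    request = {
        "id": jobid,
        "priority": priority,
        "userid": userid,
        "t_submit": t_submit,
        "jobspec": jobspec,
    }
    # Assumption: flags travel as a list of strings and are omitted when
    # empty, so a scheduler unaware of flags sees an unchanged payload.
    if flags:
        request["flags"] = sorted(flags)
    return request


def job_is_preemptible(request):
    """Scheduler-side check that tolerates requests without flags."""
    return "preemptible" in request.get("flags", [])
```

With this shape, a scheduler that does not care about preemption could ignore the extra key entirely.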

@garlick
Member

garlick commented Nov 7, 2024

In #5378, @ryanday36 mentioned an alternative to a standby flag: allow a duration range in jobspec, where a standby job would specify a min < max duration and would become eligible for preemption once the min duration expires. For traditional standby, min = 0, but a nonzero min would let a job request a minimum amount of run time before it can be preempted. It seems like we discussed this and decided it was a better idea than the flag, but I can't find anything to cite here.
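The eligibility rule described above can be sketched as follows. This is only an illustration of the semantics as stated in this comment; the function name and the choice of seconds as the unit are assumptions, not anything defined in jobspec.

```python
def is_preemptible(now, t_start, min_duration, max_duration):
    """Return True once a running job has used up its minimum duration.

    All values are in seconds. min_duration == 0 models traditional
    standby: the job is preemptible from the moment it starts.
    """
    if not 0 <= min_duration <= max_duration:
        raise ValueError("expected 0 <= min_duration <= max_duration")
    return now >= t_start + min_duration
```

For example, a job started at t=100 with a range of 1800 to 3600 seconds would not be preemptible at t=1899, but would be at t=1900.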

A duration range would require an update to

Shall we change this issue from flag to duration range?

@grondo
Contributor Author

grondo commented Nov 7, 2024

Yes, that's a good idea. I wonder if it would be easier to add an optional minimum duration instead of turning duration into a range though...

@grondo
Contributor Author

grondo commented Nov 7, 2024

BTW, I just recalled that when we discussed this proposal of a minimum duration one possible solution was to still use the preemptible flag, but to have the job-manager post the flag to the eventlog as soon as minimum runtime expired. We should explore if that makes sense, or if we leave it up to the scheduler to look for any minimum runtime and implement preemption internally without need of any kind of flags.

@garlick
Member

garlick commented Nov 12, 2024

> I wonder if it would be easier to add an optional minimum duration instead of turning duration into a range though...

Yeah, good thought. I think we would amend v1 jobspec, and if we did the min/max thing (like the way count is defined in RFC 14), then we couldn't roll back to an earlier release without newer jobs in the queue potentially containing invalid jobspec.

Maybe preemptible-after?
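The rollback concern can be illustrated with a toy check. The validator below is a stand-in for an older release, not the real jobspec validator; it assumes that an older release requires `attributes.system.duration` to be a plain number and ignores system attributes it does not recognize. The `min`/`max` keys are likewise only a guess at what a range form might look like.

```python
from numbers import Real


def old_release_accepts(jobspec):
    """Stand-in for an older validator: duration must be a plain number."""
    system = jobspec.get("attributes", {}).get("system", {})
    duration = system.get("duration", 0)
    # bool is a Real in Python, so exclude it explicitly.
    if isinstance(duration, bool) or not isinstance(duration, Real):
        return False
    return duration >= 0


# Hypothetical range form: duration becomes a min/max dictionary.
range_form = {"attributes": {"system": {"duration": {"min": 600, "max": 3600}}}}

# Hypothetical optional attribute: duration keeps its existing type.
attr_form = {"attributes": {"system": {"duration": 3600, "preemptible-after": 600}}}
```

Under these assumptions, `old_release_accepts(range_form)` is false while `old_release_accepts(attr_form)` is true, which is the property that makes the optional attribute safer to roll back.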

> I just recalled that when we discussed this proposal of a minimum duration one possible solution was to still use the preemptible flag, but to have the job-manager post the flag to the eventlog as soon as minimum runtime expired. We should explore if that makes sense, or if we leave it up to the scheduler to look for any minimum runtime and implement preemption internally without need of any kind of flags.

Yes, I like the idea of setting the flag to centralize processing of this time stuff. If exceeding preemptible-after causes a set-flags event to be posted to the job eventlog, then job-list can easily track the flag as well.
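As a sketch of why this makes tracking easy, a consumer such as job-list could recover the current flags just by replaying the eventlog. The event layout below (a `name` plus a `context` holding a `flags` list) is an assumption about how a `set-flags` event would look, and the `"preemptible"` flag name is hypothetical.

```python
def flags_from_eventlog(events):
    """Replay eventlog entries and return the set of flags seen so far."""
    flags = set()
    for event in events:
        if event["name"] == "set-flags":
            flags.update(event.get("context", {}).get("flags", []))
    return flags


# Illustrative eventlog: the last entry stands for the job manager
# posting the flag once preemptible-after (600s here) has elapsed.
eventlog = [
    {"timestamp": 1000.0, "name": "submit", "context": {"userid": 1000}},
    {"timestamp": 1001.0, "name": "alloc", "context": {}},
    {"timestamp": 1002.0, "name": "start", "context": {}},
    {"timestamp": 1602.0, "name": "set-flags", "context": {"flags": ["preemptible"]}},
]
```

Replaying only the first three entries yields no flags; replaying all four yields `{"preemptible"}`.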

RFC 27 would not only need flags added to the alloc request and hello response, but also a new RPC for setting flags on existing jobs.

@grondo
Contributor Author

grondo commented Nov 12, 2024

> Maybe preemptible-after?

Good suggestion!

@garlick
Member

garlick commented Nov 12, 2024

Well, thinking about this a bit more, it seems like the scheduler will want to incorporate future preemptibility into its "plan", so I'm not sure how useful it is to know in real time when a job becomes preemptible. Maybe it would be better to just have the scheduler raise a fatal exception on a job when it reaches that point in its schedule and skip the notification?

@grondo
Contributor Author

grondo commented Nov 12, 2024

Yeah, that's a good thought. We should get the opinion of Fluxion developers here, since the original plan was to use a flag all along.

I imagine it is much easier to support scheduling of preemptible jobs that have preemptible-after=0 than those with a minimum runtime. If a nontrivial amount of effort will be required to support preemptible-after, then it might be best to complete this work in stages: support only preemptible-after=0 at first (essentially treating it as a flag), then later add the ability to extend this to nonzero values.

Either way it doesn't seem like the flag is useful, so we can probably close this issue or change it to add the preemptible-after attribute, which is only supported by Fluxion.

@garlick
Member

garlick commented Nov 12, 2024

> Either way it doesn't seem like the flag is useful, so we can probably close this issue or change it to add the preemptible-after attribute, which is only supported by Fluxion.

Before we discard that idea completely, we might want to try a little prototype of preemptible-after with sched-simple. At minimum, we may find that some libschedutil support falls out of it that will save Fluxion developers some work. Like, I dunno, a wake-up timer for when running jobs become preemptible, since otherwise the scheduler might not have any other event to wake up on?
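A wake-up timer of this kind mostly comes down to finding the next moment at which some running job crosses its preemptible-after threshold. The sketch below shows only that calculation; it is not libschedutil API, and the function name and the `(t_start, preemptible_after)` tuple layout are made up for illustration.

```python
def next_wakeup(now, running_jobs):
    """Seconds until the next running job becomes preemptible, or None.

    running_jobs is an iterable of (t_start, preemptible_after) pairs,
    where preemptible_after is None for jobs that are never preemptible.
    Jobs that are already preemptible need no timer and are skipped.
    """
    pending = [
        t_start + after
        for t_start, after in running_jobs
        if after is not None and t_start + after > now
    ]
    if not pending:
        return None
    return min(pending) - now
```

A scheduler could arm a single timer with the returned delay and recompute it whenever a job starts or finishes.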

@grondo
Contributor Author

grondo commented Nov 12, 2024

Does libschedutil examine jobspec? In any event, an experiment with sched-simple is a good idea.

Since sched-simple is not a planning scheduler, preemptible jobs would not have much effect if they are submitted behind a pending job. However, it could be used to test that submission of a high priority job kills off any existing preemptible jobs if that would allow the job to run. 🤷
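For the experiment described above, the core decision is whether canceling preemptible jobs would free enough resources for the new job. Here is a deliberately simplified sketch that counts cores only and ignores nodes, topology, and job ordering policy; the function and the job dictionary layout are hypothetical and not taken from sched-simple.

```python
def select_victims(needed, free, running):
    """Pick preemptible jobs to cancel so a job needing `needed` cores fits.

    Returns [] if the job already fits, a list of job ids to cancel if
    preemption makes room, or None if even canceling every preemptible
    job would not free enough cores.
    """
    if needed <= free:
        return []
    victims = []
    reclaimed = free
    for job in running:
        if not job["preemptible"]:
            continue
        victims.append(job["id"])
        reclaimed += job["ncores"]
        if reclaimed >= needed:
            return victims
    return None


running = [
    {"id": 1, "ncores": 4, "preemptible": False},
    {"id": 2, "ncores": 2, "preemptible": True},
    {"id": 3, "ncores": 4, "preemptible": True},
]
```

Returning `None` rather than a partial list matters here: no preemptible job should be killed unless doing so actually allows the high priority job to run.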
