
Only enforce scheduler_key unique index on first execution #6

Merged

Conversation

jujustayfly
Contributor

Currently, if a job is periodically scheduled but also long-running (spanning multiple workloads), part of its execution gets skipped. This happens because we have a unique index on scheduler_key, which gets copied over to subsequent workloads. So if the next scheduled job is already enqueued (no matter how far in the future), any new workload from the currently executing job gets swallowed by the conflict on the unique index.

This is particularly an issue when using job-iteration, which relies on interrupting and re-enqueueing many workloads for the same Active Job over a long period of time.

The proposed change is to restrict the unique index so that it only applies to workloads whose executions count is 0. This is acceptable because jobs will still respect the execution_concurrency_key and enqueue_concurrency_key unique indexes. Retry policies can also be configured on the jobs themselves if needed.
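The intended semantics of such a partial unique index can be sketched as a small in-memory model in plain Ruby. This is not Gouda's actual code (in the real schema the constraint would live in PostgreSQL as a partial unique index, e.g. with a `WHERE` clause limiting it to first executions); the `Workload` struct and `WorkloadQueue` class here are hypothetical names used only for illustration:

```ruby
# In-memory sketch of the proposed dedup rule: a workload conflicts on
# scheduler_key only when it is a first execution (executions == 0).
# Retried or continued workloads (executions > 0) always get through.
Workload = Struct.new(:scheduler_key, :executions, keyword_init: true)

class WorkloadQueue
  def initialize
    @workloads = []
  end

  # Returns true if the workload was enqueued, false if it was rejected
  # by the (simulated) partial unique index on scheduler_key.
  def enqueue(workload)
    if workload.executions.zero? &&
       @workloads.any? { |w| w.executions.zero? && w.scheduler_key == workload.scheduler_key }
      return false # simulated unique-index violation: first execution already enqueued
    end
    @workloads << workload
    true
  end
end
```

Under this rule, a long-running job that re-enqueues itself (executions > 0) is no longer swallowed when the next scheduled run (executions == 0) is already waiting in the queue.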

@svanhesteren
Collaborator

Hey, cool find! A couple of things: I'm not 100% sure what all the implications will be; a test showing the issue and its resolution would definitely help.

Next to that, just changing the index in the Gouda class won't update the existing indexes in the database. We need a gouda:update task next to the gouda:install task. GoodJob has this too, so we can do it the same way. Perhaps I'll implement that separately, outside of this PR.

@julik
Contributor

julik commented Jun 14, 2024

What happens if there is no serialized_params, or no executions value within it? How will PG react? If we need to know which Workload was the first for a particular job, maybe pull that into the table proper, so that there is always a value of the correct type?

@jujustayfly
Contributor Author

I wonder if a lighter solution would be to not copy the scheduler_key onto workloads created by retries 🤔 The original job could still be tracked via its active_job_id.

@jujustayfly
Contributor Author

I have opted for the lighter solution: only setting a scheduler_key on the initial execution of an Active Job. This way no migration is required. @svanhesteren, I see we had explicitly written code to always copy the scheduler_key on all executions, but I'm not sure that is the desired behavior, given that we also have the enqueue and execution concurrency mechanisms. Wdyt?
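The adopted approach can be sketched as follows. The function name and hash shape are hypothetical, not Gouda's actual API; the point is only that a follow-up workload keeps its link to the original job via active_job_id while dropping the scheduler_key:

```ruby
# Sketch of the adopted lighter solution (names hypothetical): when
# building a follow-up workload from an existing one, do not copy the
# scheduler_key, so only first executions carry it and only they can
# conflict on the unique index.
def build_followup_workload(original)
  {
    active_job_id: original[:active_job_id],  # still links back to the original job
    scheduler_key: nil,                       # deliberately not copied from the original
    executions: original[:executions] + 1
  }
end
```

Because retried workloads carry no scheduler_key, no change to the existing unique index (and hence no migration) is needed.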

@jujustayfly jujustayfly requested a review from svanhesteren June 17, 2024 13:50
Collaborator

@svanhesteren svanhesteren left a comment


This is great! A nice and simple solution. 🚀

@jujustayfly jujustayfly merged commit 74f53a9 into main Jun 18, 2024
2 checks passed