Update IsManagedPod to check for job run id and job id #3620
Conversation
We're going to double check with one of the core devs tomorrow, but most likely we need both labels on the pod. I think what we want to do is validate that the executor has given us valid run ids and reject their update if they haven't.
I think adding validation on the server side is definitely ideal. I'd be happy to add that as well; just let me know!
@Sovietaced - by all means!
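For illustration, a minimal sketch of what such server-side validation could look like. The type and function names here (`JobRunStateUpdate`, `validateRunIds`) are assumptions for the example, not the actual Armada scheduler API:

```go
package server

import (
	"fmt"

	"github.com/google/uuid"
)

// JobRunStateUpdate is an illustrative stand-in for whatever message the
// executor reports run state with; it is not the real Armada type.
type JobRunStateUpdate struct {
	JobId    string
	JobRunId string
}

// validateRunIds rejects updates whose job id is empty or whose run id is
// not a valid UUID, so the server never accepts state for runs it cannot
// identify.
func validateRunIds(updates []JobRunStateUpdate) error {
	for _, u := range updates {
		if u.JobId == "" {
			return fmt.Errorf("run state update is missing a job id")
		}
		if _, err := uuid.Parse(u.JobRunId); err != nil {
			return fmt.Errorf("invalid job run id %q for job %s: %w", u.JobRunId, u.JobId, err)
		}
	}
	return nil
}
```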
@dave-gantenbein should be good for review. I think the go mod CI job flaked, so that failure should be unrelated.
Got a unit test flake that seems unrelated.
@Sovietaced Can you rebase this branch from |
Force-pushed from 8985b5a to 5b7638b (Signed-off-by: Jason Parraga <sovietaced@gmail.com>).
done
@Sovietaced - we discussed this, and while it won't cause any negative impact, I'm wondering if you've updated things so the executors and server are on the same version, which would be the recommended pattern. If there's something blocking you from doing that, let me know here or in Slack and we can merge this. But again, this isn't how we'd recommend setting things up long term, as other issues could potentially creep in and we won't be testing this scenario...
We haven't updated things yet and have been running a fork, but as I already mentioned, I realize this is not a supported deployment. I don't think the PR makes the codebase any worse, considering there is no validation being done in the scheduler API, but you do you.
We had an issue where a v3 and a v4 executor were running on the same cluster, and the v4 executor was pulling empty job run ids from pods launched by the v3 executor. We realize that we shouldn't run two executors in the same cluster; however, it also feels like the pod-filtering logic used for cluster utilization should check for the job run id label, since that is what it ultimately reads from the pod and assumes is present.
Fixes: #3618
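For reference, a minimal sketch of the kind of label check this change describes, assuming label keys `armada_job_id` and `armada_job_run_id` (the exact constants in the codebase may differ):

```go
package util

import corev1 "k8s.io/api/core/v1"

// Assumed label keys; the real constants in the Armada codebase may differ.
const (
	jobIdLabel    = "armada_job_id"
	jobRunIdLabel = "armada_job_run_id"
)

// isManagedPod treats a pod as managed only if both the job id and the job
// run id labels are present and non-empty, so pods created by an older
// executor that did not set the run id label are filtered out.
func isManagedPod(pod *corev1.Pod) bool {
	if pod == nil || pod.Labels == nil {
		return false
	}
	return pod.Labels[jobIdLabel] != "" && pod.Labels[jobRunIdLabel] != ""
}
```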