-
Notifications
You must be signed in to change notification settings - Fork 674
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Core feature] Allow flyteadmin to start even if OIDC is unavailable (Improve flyteadmin startup resiliency) #5701
Comments
The issue of FlyteAdmin being blocked from starting up until the OIDC provider is healthy is known. According to Flyte's documentation, FlyteAdmin requires an OIDC provider for authentication when A proposed solution is to allow FlyteAdmin to start even if the OIDC provider is unavailable and re-initialize the OIDC provider later. This approach would improve resiliency and can be made configurable. Currently, a workaround is to disable Flyte until the OIDC provider is available. Would you like more details on how to implement this or any other specific aspect? Sources: This is an AI-generated response and your feedback is appreciated! Please leave a 👍 if this is helpful and 👎 if it is not. Mention @RunLLM to ask follow-up questions. See runllm.com for more! |
This is a double edged sword. If I recall correctly there was some work done recently to indefinitely cache something related to OIDC on boot. If the OIDC provider is down and you do a normal rolling deployment you could end up in a worse state since you previously had working pods but now you have broken ones. |
@Sovietaced , can you point to this change? Are you thinking of #5621? |
Yeah I think so |
Motivation: Why do you think this is important?
Today, the flyteadmin pod is blocked from starting up until the OIDC provider is healthy and available (the pod gets stuck in Error state). In some Kubernetes configurations, this erroring-pod could cause deployment-wide issues. The current behavior could be made more resilient.
(Note that this applies to configurations using
useAuth=true
)Goal: What should the final outcome look like, ideally?
A better approach in these configurations is to allow flyte to start up, even if the OIDC provider is unavailable. Then, try to re-initialize the OIDC provider later in the deployment lifespan. This is a more resilient approach, and it can be made configurable.
Describe alternatives you've considered
A workaround is to disable Flyte until the OIDC provider is available.
Propose: Link/Inline OR Additional context
Proposed fix here: #5702
Are you sure this issue hasn't been raised already?
Have you read the Code of Conduct?
The text was updated successfully, but these errors were encountered: