DM-47656: Re-enable retries in Prompt Processing fan-out service #19

Merged
merged 5 commits into from
Nov 20, 2024

kfindeisen
Member

This PR reactivates the old retry code, and adds support for a config switch and delayed retries.

One parameter was missing, and others weren't valid Numpydoc.

Retries are only triggered on HTTP 503, which is sent by Prompt
Processing to signal that retries are safe. While most HTTP 502 errors
are retriable in principle, Knative/KEDA has no way of knowing whether
there might be conflicts with the APDB or central repo.

Retries are now toggled with an environment variable, so that they can
be turned off with a config change. Currently the switch is
Knative-only, since PP's signaling will need to be completely rewritten
for a non-server architecture.

These 503 responses generally ask for a delay to clear the condition that
caused the failure (e.g., a pod or worker restart).

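For readers who want a concrete picture of the behavior described so far, here is a minimal sketch, not the code from this PR. It assumes the requested delay arrives in a standard `Retry-After` header given in seconds, and every name in it (`FANOUT_RETRIES_ENABLED`, `MAX_RETRIES`, `DEFAULT_DELAY`, `send_with_retry`) is hypothetical.

```python
import os
import time

# Hypothetical names for illustration only; none are taken from the PR.
RETRY_ENV_VAR = "FANOUT_RETRIES_ENABLED"
MAX_RETRIES = 3
DEFAULT_DELAY = 1.0  # seconds, used when no usable Retry-After is present


def retries_enabled(environ=os.environ):
    """Return True if the (hypothetical) config switch turns retries on."""
    value = environ.get(RETRY_ENV_VAR, "false")
    return value.strip().lower() in {"1", "true", "yes"}


def retry_delay(headers):
    """Return the delay in seconds requested by the response.

    Assumes a Retry-After header holding a number of seconds.
    """
    try:
        return max(0.0, float(headers.get("Retry-After")))
    except (TypeError, ValueError):
        # Header missing or in HTTP-date form; fall back to a fixed delay.
        return DEFAULT_DELAY


def send_with_retry(send, environ=os.environ, sleep=time.sleep):
    """Call ``send()`` and retry only while it returns HTTP 503.

    ``send`` is any callable returning ``(status_code, headers)``.
    Other failures, including HTTP 502, are returned without a retry.
    """
    status, headers = send()
    if not retries_enabled(environ):
        return status
    attempts = 0
    while status == 503 and attempts < MAX_RETRIES:
        sleep(retry_delay(headers))  # wait for the condition to clear
        attempts += 1
        status, headers = send()
    return status
```

Passing `environ` and `sleep` as parameters keeps the sketch testable without touching the real environment or actually waiting.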
The new log is analogous to the log for first-time requests, just with
"initial" replaced with "retried".
Collaborator

@hsinfang hsinfang left a comment

LGTM. Thank you.

@kfindeisen kfindeisen merged commit 3ef17f9 into main Nov 20, 2024
2 checks passed
@kfindeisen kfindeisen deleted the tickets/DM-47656 branch November 20, 2024 00:35

2 participants