Make llm serving template serverless #84
@@ -0,0 +1,16 @@
import requests
Add this to make it a bit more friendly out of the box
Is any update to the README needed to run `serve run main:my_app`?
It was already there; I just created the file so the user doesn't need to copy-paste code from the markdown cell.
💯
https://github.com/anyscale/product/pull/26531 needs to land before this can merge
@shomilj is the multi-AZ stuff in the serverless defaults?
No, this is unrelated to serverless - this is something that should be set in the base advanced config if desired. So we should keep these around in the compute configs @ericl:
@shomilj ooc, will serverless config only add worker_node_types?
@gvspraveen yes, "serverless" is just "auto_select_worker_config" set to true - all it does is handle worker node types
@shomilj any reason not to enable this by default for all OA configs? It seems strange we have these configs that are required for the top CUJ that are off by default.
https://github.com/anyscale/product/pull/26531 has been merged. We will likely enable multi-zone by default for OA workloads -- will tackle that as a separate work item this upcoming sprint. So I would add back the aws / gcp advanced config blocks for now, but other than that this is ready to go :)
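For anyone skimming the thread, here is a rough sketch of the change being asked for, written as a plain Python dict transform rather than the template's actual compute config files. Only "auto_select_worker_config" and "worker_node_types" come from the comments above. The helper name, the "head_node_type" and "aws_advanced_config" keys, and all values are hypothetical placeholders. Dropping the explicit worker list when the flag is set is also an assumption, not confirmed behavior.

```python
from copy import deepcopy

# The two key names below are quoted in the thread; everything else in this
# sketch (helper name, other keys, values) is a hypothetical placeholder.
SERVERLESS_FLAG = "auto_select_worker_config"
WORKER_TYPES_KEY = "worker_node_types"


def make_serverless(compute_config: dict) -> dict:
    """Return a copy of the config with worker selection delegated.

    Assumption: turning the flag on replaces the explicit worker list.
    Every other key, including any cloud-specific advanced config block,
    is carried over unchanged, since the flag only concerns worker nodes.
    """
    result = deepcopy(compute_config)
    result.pop(WORKER_TYPES_KEY, None)  # no hand-picked worker types
    result[SERVERLESS_FLAG] = True      # the "serverless" switch
    return result


if __name__ == "__main__":
    # Hypothetical pre-change config, for illustration only.
    before = {
        "head_node_type": {"instance_type": "placeholder-head"},
        "worker_node_types": [{"instance_type": "placeholder-gpu"}],
        "aws_advanced_config": {"placeholder": "multi-zone settings"},
    }
    print(make_serverless(before))
```

The point of the sketch is the last review note: because the flag touches only worker node types, the aws / gcp advanced blocks are not implied by it and have to stay in the config explicitly.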