Make llm serving template serverless #84

ericl · 2024-02-21T19:55:42Z

No description provided.

Signed-off-by: Eric Liang <ekhliang@gmail.com>

ericl · 2024-02-21T20:10:58Z

templates/intro-services/main.py

@@ -0,0 +1,16 @@
+import requests


Add this to make it a bit more friendly out of the box

gvspraveen · 2024-02-22

Any update needed to the README to run `serve run main:my_app`?

ericl · 2024-02-22

It was already there; I just created the file so the user doesn't need to copy-paste code from the markdown cell.

configs/endpoints_v2/gcp.yaml

shomilj

💯

shomilj

https://github.com/anyscale/product/pull/26531 needs to land before this can merge

akshay-anyscale · 2024-02-22T03:23:33Z

@shomilj is multi-az stuff in serverless defaults?

shomilj · 2024-02-22T03:35:52Z

> @shomilj is multi-az stuff in serverless defaults?

No, this is unrelated to serverless - this is something that should be set in the base advanced config if desired. So we should keep these around in the compute configs @ericl:

aws:
  TagSpecifications:
    - ResourceType: instance
      Tags:
        - Key: as-feature-enable-multi-az-serve
          Value: "true"
        - Key: as-feature-multi-zone
          Value: "true"
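As a rough illustration of the point above, here is a minimal Python sketch that checks whether an advanced config block still carries both multi-AZ tags. The two tag keys are quoted from the YAML above; the helper name `missing_multi_az_tags` and the idea of validating the block as a plain dict are assumptions made for this example, not part of any Anyscale tooling.

```python
from typing import Any

# Tag keys quoted from the advanced config block in this thread.
MULTI_AZ_TAG_KEYS = ("as-feature-enable-multi-az-serve", "as-feature-multi-zone")


def missing_multi_az_tags(advanced_config: dict[str, Any]) -> list[str]:
    """Return the multi-AZ tag keys that are absent or not set to "true".

    Hypothetical helper: it only inspects instance-level TagSpecifications.
    """
    enabled = set()
    for spec in advanced_config.get("TagSpecifications", []):
        if spec.get("ResourceType") != "instance":
            continue
        for tag in spec.get("Tags", []):
            if tag.get("Value") == "true":
                enabled.add(tag.get("Key"))
    return [key for key in MULTI_AZ_TAG_KEYS if key not in enabled]


# The `aws` block from the comment above, written as a Python dict.
aws_advanced_config = {
    "TagSpecifications": [
        {
            "ResourceType": "instance",
            "Tags": [
                {"Key": "as-feature-enable-multi-az-serve", "Value": "true"},
                {"Key": "as-feature-multi-zone", "Value": "true"},
            ],
        }
    ]
}

print(missing_multi_az_tags(aws_advanced_config))  # []
```

A check like this could run over each compute config in the repo to catch the case where the tags are dropped by accident.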

gvspraveen · 2024-02-22T03:40:09Z

> No, this is unrelated to serverless - this is something that should be set in the base advanced config if desired.

@shomilj ooc, will serverless config only add worker_node_types?

shomilj · 2024-02-22T03:45:17Z

@gvspraveen yes, "serverless" is just "auto_select_worker_config" set to true - all it does is handle worker node types
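To make the distinction concrete, here is a small Python sketch of what converting a compute config to "serverless" might look like under that description. Only the names `auto_select_worker_config`, `worker_node_types`, and the `aws` block come from this thread; the overall dict layout, the `head_node_type` field, the instance type values, and the helper `to_serverless` are hypothetical and may not match the real compute config schema.

```python
import copy
from typing import Any


def to_serverless(compute_config: dict[str, Any]) -> dict[str, Any]:
    """Return a copy that relies on automatic worker selection.

    Assumption: explicit worker node types are dropped and every other
    field, including the cloud-specific advanced config, is left untouched.
    """
    result = copy.deepcopy(compute_config)
    result.pop("worker_node_types", None)
    result["auto_select_worker_config"] = True
    return result


# Hypothetical starting config; the values are placeholders.
before = {
    "head_node_type": {"instance_type": "m5.2xlarge"},
    "worker_node_types": [{"instance_type": "g5.4xlarge"}],
    "aws": {
        "TagSpecifications": [
            {
                "ResourceType": "instance",
                "Tags": [{"Key": "as-feature-multi-zone", "Value": "true"}],
            }
        ]
    },
}

after = to_serverless(before)
print(sorted(after))
```

The `aws` block passes through unchanged, which is why the multi-AZ tags have to be kept in the compute configs explicitly.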

ericl · 2024-02-22T04:27:15Z

> No, this is unrelated to serverless - this is something that should be set in the base advanced config if desired. So we should keep these around in the compute configs @ericl:

@shomilj any reason not to enable this by default for all OA configs? It seems strange we have these configs that are required for the top CUJ that are off by default.

shomilj · 2024-02-23T20:26:34Z

https://github.com/anyscale/product/pull/26531 has been merged.

We will likely enable multi-zone by default for OA workloads -- will tackle that as a separate work item this upcoming sprint. So I would add back the aws / gcp advanced config blocks for now, but other than that this is ready to go :)

Signed-off-by: Eric Liang <ekhliang@gmail.com>

Make llm serving template serverless

update

541699c

Signed-off-by: Eric Liang <ekhliang@gmail.com>

ericl assigned shomilj and akshay-anyscale Feb 21, 2024

pre-create the file for user friendliness

5157e75

ericl commented Feb 21, 2024

shomilj approved these changes Feb 21, 2024

shomilj reviewed Feb 21, 2024

add back

254fa33

Signed-off-by: Eric Liang <ekhliang@gmail.com>

ericl merged commit 5377a9a into main Feb 23, 2024
1 check passed

anmscale pushed a commit that referenced this pull request Jun 22, 2024

Merge pull request #84 from anyscale/serverless-llm

95bc425

Make llm serving template serverless
