Skip to content

Commit

Permalink
preparation for 1.0.0 release
Browse files Browse the repository at this point in the history
  • Loading branch information
alpayariyak committed Jun 12, 2024
1 parent bad5ddd commit c8458fe
Show file tree
Hide file tree
Showing 4 changed files with 10 additions and 12 deletions.
2 changes: 1 addition & 1 deletion Dockerfile
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
ARG WORKER_CUDA_VERSION=11.8.0
ARG BASE_IMAGE_VERSION=1.0.0preview
ARG BASE_IMAGE_VERSION=1.0.0
FROM runpod/worker-vllm:base-${BASE_IMAGE_VERSION}-cuda${WORKER_CUDA_VERSION} AS vllm-base

RUN apt-get update -y \
Expand Down
15 changes: 7 additions & 8 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,11 +2,11 @@

# OpenAI-Compatible vLLM Serverless Endpoint Worker
Deploy OpenAI-Compatible Blazing-Fast LLM Endpoints powered by the [vLLM](https://github.com/vllm-project/vllm) Inference Engine on RunPod Serverless with just a few clicks.

<!--
![vLLM Version](https://img.shields.io/badge/dynamic/yaml?url=https%3A%2F%2Fraw.githubusercontent.com%2Frunpod-workers%2Fworker-vllm%2Fmain%2Fvllm-base-image%2Fvllm-metadata.yml&query=%24.version&style=for-the-badge&logo=data%3Aimage%2Fsvg%2Bxml%3Bbase64%2CPD94bWwgdmVyc2lvbj0iMS4wIiBlbmNvZGluZz0iVVRGLTgiPz4KPCFET0NUWVBFIHN2ZyBQVUJMSUMgIi0vL1czQy8vRFREIFNWRyAxLjEvL0VOIiAiaHR0cDovL3d3dy53My5vcmcvR3JhcGhpY3MvU1ZHLzEuMS9EVEQvc3ZnMTEuZHRkIj4KPHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZlcnNpb249IjEuMSIgd2lkdGg9IjU1cHgiIGhlaWdodD0iNTZweCIgc3R5bGU9InNoYXBlLXJlbmRlcmluZzpnZW9tZXRyaWNQcmVjaXNpb247IHRleHQtcmVuZGVyaW5nOmdlb21ldHJpY1ByZWNpc2lvbjsgaW1hZ2UtcmVuZGVyaW5nOm9wdGltaXplUXVhbGl0eTsgZmlsbC1ydWxlOmV2ZW5vZGQ7IGNsaXAtcnVsZTpldmVub2RkIiB4bWxuczp4bGluaz0iaHR0cDovL3d3dy53My5vcmcvMTk5OS94bGluayI%2BCjxnPjxwYXRoIHN0eWxlPSJvcGFjaXR5OjEiIGZpbGw9IiMzN2E0ZmUiIGQ9Ik0gNTEuNSwwLjUgQyA0Ni41ODIyLDE4LjA4MzggNDEuOTE1NiwzNS43NTA1IDM3LjUsNTMuNUMgMzIuMTY2Nyw1My41IDI2LjgzMzMsNTMuNSAyMS41LDUzLjVDIDIwLjgzMzMsNTMuNSAyMC41LDUzLjE2NjcgMjAuNSw1Mi41QyAyMS4zMzgyLDUyLjE1ODMgMjEuNjcxNiw1MS40OTE2IDIxLjUsNTAuNUMgMjIuMjIyOSw0Ni44NTU1IDIzLjIyMjksNDMuMTg4OSAyNC41LDM5LjVDIDI0LjY5MTcsMzYuMzk5MiAyNS4zNTg0LDMzLjM5OTIgMjYuNSwzMC41QyAyNi4yOTA3LDI5LjkxNCAyNS45NTc0LDI5LjQxNCAyNS41LDI5QyAyNy40NDE0LDI3LjE4NDEgMjguMTA4MSwyNS4xODQxIDI3LjUsMjNDIDI5LjI0MTUsMTguNTM4NyAzMC45MDgyLDE0LjAzODcgMzIuNSw5LjVDIDM4Ljc3NTcsNi4xOTM1OCA0NS4xMDkxLDMuMTkzNTggNTEuNSwwLjUgWiIvPjwvZz4KPGc%2BPHBhdGggc3R5bGU9Im9wYWNpdHk6MC45ODQiIGZpbGw9IiNmY2I3MWQiIGQ9Ik0gMjIuNSwxMi41IEMgMjEuNTA0NiwyNC45ODkgMjEuMTcxMywzNy42NTU3IDIxLjUsNTAuNUMgMjEuNjcxNiw1MS40OTE2IDIxLjMzODIsNTIuMTU4MyAyMC41LDUyLjVDIDEzLjAzMTEsMzkuMjI4NyA2LjM2NDQxLDI1LjU2MjEgMC41LDExLjVDIDguMDE5MDUsMTEuMTc1IDE1LjM1MjQsMTEuNTA4NCAyMi41LDEyLjUgWiIvPjwvZz4KPGc%2BPHBhdGggc3R5bGU9Im9wYWNpdHk6MC4wMiIgZmlsbD0iI2Q3ZGZlOCIgZD0iTSAyMi41LDEyLjUgQyAyMy4xNjY3LDIxLjUgMjMuODMzMywzMC41IDI0LjUsMzkuNUMgMjMuMjIyOSw0My4xODg5IDIyLjIyMjksNDYuODU1NSAyMS41LDUwLjVDIDIxLjE3MTMsMzcuNjU1NyAyMS41MDQ2LDI0Ljk4OSAyMi41LDEyLjUgWiIvPjwvZz4KPGc%2BPHBhdGggc3R5bGU9Im9wYWNpdHk6MC43NTMiIGZpbGw9IiNjZmQ2ZGQiIGQ9Ik0gNTEuNSwwLjUgQyA1Mi42MTI5LDEuOTQ2MzkgNTIuNzc5NiwzLjYxMzA1IDUyLDUuNUMgNDcuODAzNiwyMi4yODg3IDQzLjMwMzYsMzguOTU1MyAzOC41LDU1LjVDIDMyLjUsNTUuNSAyNi41LDU1LjUgMjAuNSw1NS41QyAyMC44MzMzLDU0LjgzMzMgMjEuMTY2Nyw1NC4xNjY3IDIxLjUsNTMuNUMgMjYuODMzMyw1My41IDMyLjE2NjcsNTMuNSAzNy41LDUzLjVDIDQxLjkxNTYsMzUuNzUwNSA0Ni41ODIyLDE4LjA4MzggNTEuNSwwLjUgWiIvPjwvZz4KPC9zdmc%2BCg%3D%3D&label=STABLE%20vLLM%20Version&link=https%3A%2F%2Fgithub.com%2Fvllm-project%2Fvllm)
![Worker Version](https://img.shields.io/github/v/tag/runpod-workers/worker-vllm?style=for-the-badge&logo=data%3Aimage%2Fsvg%2Bxml%3Bbase64%2CPD94bWwgdmVyc2lvbj0iMS4wIiBlbmNvZGluZz0idXRmLTgiPz4KPCEtLSBHZW5lcmF0b3I6IEFkb2JlIElsbHVzdHJhdG9yIDI2LjUuMywgU1ZHIEV4cG9ydCBQbHVnLUluIC4gU1ZHIFZlcnNpb246IDYuMDAgQnVpbGQgMCkgIC0tPgo8c3ZnIHZlcnNpb249IjEuMSIgaWQ9IkxheWVyXzEiIHhtbG5zPSJodHRwOi8vd3d3LnczLm9yZy8yMDAwL3N2ZyIgeG1sbnM6eGxpbms9Imh0dHA6Ly93d3cudzMub3JnLzE5OTkveGxpbmsiIHg9IjBweCIgeT0iMHB4IgoJIHZpZXdCb3g9IjAgMCAyMDAwIDIwMDAiIHN0eWxlPSJlbmFibGUtYmFja2dyb3VuZDpuZXcgMCAwIDIwMDAgMjAwMDsiIHhtbDpzcGFjZT0icHJlc2VydmUiPgo8c3R5bGUgdHlwZT0idGV4dC9jc3MiPgoJLnN0MHtmaWxsOiM2NzNBQjc7fQo8L3N0eWxlPgo8Zz4KCTxnPgoJCTxwYXRoIGNsYXNzPSJzdDAiIGQ9Ik0xMDE3Ljk1LDcxMS4wNGMtNC4yMiwyLjM2LTkuMTgsMy4wMS0xMy44NiwxLjgyTDM4Ni4xNyw1NTUuM2MtNDEuNzItMTAuNzYtODYuMDItMC42My0xMTYuNiwyOS43MwoJCQlsLTEuNCwxLjM5Yy0zNS45MiwzNS42NS0yNy41NSw5NS44LDE2Ljc0LDEyMC4zbDU4NC4zMiwzMjQuMjNjMzEuMzYsMTcuNCw1MC44Miw1MC40NSw1MC44Miw4Ni4zMnY4MDYuNzYKCQkJYzAsMzUuNDktMzguNDEsNTcuNjctNjkuMTUsMzkuOTRsLTcwMy4xNS00MDUuNjRjLTIzLjYtMTMuNjEtMzguMTMtMzguNzgtMzguMTMtNjYuMDJWNjY2LjYzYzAtODcuMjQsNDYuNDUtMTY3Ljg5LDEyMS45Mi0yMTEuNjYKCQkJTDkzMy44NSw0Mi4xNWMyMy40OC0xMy44LDUxLjQ3LTE3LjcsNzcuODMtMTAuODRsNzQ1LjcxLDE5NC4xYzMxLjUzLDguMjEsMzYuOTksNTAuNjUsOC41Niw2Ni41N0wxMDE3Ljk1LDcxMS4wNHoiLz4KCQk8cGF0aCBjbGFzcz0ic3QwIiBkPSJNMTUyNy43NSw1MzYuMzhsMTI4Ljg5LTc5LjYzbDE4OS45MiwxMDkuMTdjMjcuMjQsMTUuNjYsNDMuOTcsNDQuNzMsNDMuODIsNzYuMTVsLTQsODU3LjYKCQkJYy0wLjExLDI0LjM5LTEzLjE1LDQ2Ljg5LTM0LjI1LDU5LjExbC03MDEuNzUsNDA2LjYxYy0zMi4zLDE4LjcxLTcyLjc0LTQuNTktNzIuNzQtNDEuOTJ2LTc5Ny40MwoJCQljMC0zOC45OCwyMS4wNi03NC45MSw1NS4wNy05My45Nmw1OTAuMTctMzMwLjUzYzE4LjIzLTEwLjIxLDE4LjY1LTM2LjMsMC43NS00Ny4wOUwxNTI3Ljc1LDUzNi4zOHoiLz4KCQk8cGF0aCBjbGFzcz0ic3QwIiBkPSJNMTUyNC4wMSw2NjUuOTEiLz4KCTwvZz4KPC9nPgo8L3N2Zz4K&logoColor=%23ffffff&label=STABLE%20Worker%20Version&color=%23673ab7)
![vLLM Version](https://img.shields.io/badge/dynamic/yaml?url=https%3A%2F%2Fraw.githubusercontent.com%2Frunpod-workers%2Fworker-vllm%2Fmain%2Fvllm-base-image%2Fvllm-metadata.yml&query=%24.dev_version&style=for-the-badge&logo=data%3Aimage%2Fsvg%2Bxml%3Bbase64%2CPD94bWwgdmVyc2lvbj0iMS4wIiBlbmNvZGluZz0iVVRGLTgiPz4KPCFET0NUWVBFIHN2ZyBQVUJMSUMgIi0vL1czQy8vRFREIFNWRyAxLjEvL0VOIiAiaHR0cDovL3d3dy53My5vcmcvR3JhcGhpY3MvU1ZHLzEuMS9EVEQvc3ZnMTEuZHRkIj4KPHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZlcnNpb249IjEuMSIgd2lkdGg9IjU1cHgiIGhlaWdodD0iNTZweCIgc3R5bGU9InNoYXBlLXJlbmRlcmluZzpnZW9tZXRyaWNQcmVjaXNpb247IHRleHQtcmVuZGVyaW5nOmdlb21ldHJpY1ByZWNpc2lvbjsgaW1hZ2UtcmVuZGVyaW5nOm9wdGltaXplUXVhbGl0eTsgZmlsbC1ydWxlOmV2ZW5vZGQ7IGNsaXAtcnVsZTpldmVub2RkIiB4bWxuczp4bGluaz0iaHR0cDovL3d3dy53My5vcmcvMTk5OS94bGluayI%2BCjxnPjxwYXRoIHN0eWxlPSJvcGFjaXR5OjEiIGZpbGw9IiMzN2E0ZmUiIGQ9Ik0gNTEuNSwwLjUgQyA0Ni41ODIyLDE4LjA4MzggNDEuOTE1NiwzNS43NTA1IDM3LjUsNTMuNUMgMzIuMTY2Nyw1My41IDI2LjgzMzMsNTMuNSAyMS41LDUzLjVDIDIwLjgzMzMsNTMuNSAyMC41LDUzLjE2NjcgMjAuNSw1Mi41QyAyMS4zMzgyLDUyLjE1ODMgMjEuNjcxNiw1MS40OTE2IDIxLjUsNTAuNUMgMjIuMjIyOSw0Ni44NTU1IDIzLjIyMjksNDMuMTg4OSAyNC41LDM5LjVDIDI0LjY5MTcsMzYuMzk5MiAyNS4zNTg0LDMzLjM5OTIgMjYuNSwzMC41QyAyNi4yOTA3LDI5LjkxNCAyNS45NTc0LDI5LjQxNCAyNS41LDI5QyAyNy40NDE0LDI3LjE4NDEgMjguMTA4MSwyNS4xODQxIDI3LjUsMjNDIDI5LjI0MTUsMTguNTM4NyAzMC45MDgyLDE0LjAzODcgMzIuNSw5LjVDIDM4Ljc3NTcsNi4xOTM1OCA0NS4xMDkxLDMuMTkzNTggNTEuNSwwLjUgWiIvPjwvZz4KPGc%2BPHBhdGggc3R5bGU9Im9wYWNpdHk6MC45ODQiIGZpbGw9IiNmY2I3MWQiIGQ9Ik0gMjIuNSwxMi41IEMgMjEuNTA0NiwyNC45ODkgMjEuMTcxMywzNy42NTU3IDIxLjUsNTAuNUMgMjEuNjcxNiw1MS40OTE2IDIxLjMzODIsNTIuMTU4MyAyMC41LDUyLjVDIDEzLjAzMTEsMzkuMjI4NyA2LjM2NDQxLDI1LjU2MjEgMC41LDExLjVDIDguMDE5MDUsMTEuMTc1IDE1LjM1MjQsMTEuNTA4NCAyMi41LDEyLjUgWiIvPjwvZz4KPGc%2BPHBhdGggc3R5bGU9Im9wYWNpdHk6MC4wMiIgZmlsbD0iI2Q3ZGZlOCIgZD0iTSAyMi41LDEyLjUgQyAyMy4xNjY3LDIxLjUgMjMuODMzMywzMC41IDI0LjUsMzkuNUMgMjMuMjIyOSw0My4xODg5IDIyLjIyMjksNDYuODU1NSAyMS41LDUwLjVDIDIxLjE3MTMsMzcuNjU1NyAyMS41MDQ2LDI0Ljk4OSAyMi41LDEyLjUgWiIvPjwvZz4KPGc%2BPHBhdGggc3R5bGU9Im9wYWNpdHk6MC43NTMiIGZpbGw9IiNjZmQ2ZGQiIGQ9Ik0gNTEuNSwwLjUgQyA1Mi42MTI5LDEuOTQ2MzkgNTIuNzc5NiwzLjYxMzA1IDUyLDUuNUMgNDcuODAzNiwyMi4yODg3IDQzLjMwMzYsMzguOTU1MyAzOC41LDU1LjVDIDMyLjUsNTUuNSAyNi41LDU1LjUgMjAuNSw1NS41QyAyMC44MzMzLDU0LjgzMzMgMjEuMTY2Nyw1NC4xNjY3IDIxLjUsNTMuNUMgMjYuODMzMyw1My41IDMyLjE2NjcsNTMuNSAzNy41LDUzLjVDIDQxLjkxNTYsMzUuNzUwNSA0Ni41ODIyLDE4LjA4MzggNTEuNSwwLjUgWiIvPjwvZz4KPC9zdmc%2BCg%3D%3D&label=DEV%20vLLM%20Version%20&link=https%3A%2F%2Fgithub.com%2Fvllm-project%2Fvllm)\
![Docker Pulls](https://img.shields.io/docker/pulls/runpod/worker-vllm?style=for-the-badge&logo=docker&label=Docker%20Pulls&link=https%3A%2F%2Fhub.docker.com%2Frepository%2Fdocker%2Frunpod%2Fworker-vllm%2Fgeneral)
![Docker Pulls](https://img.shields.io/docker/pulls/runpod/worker-vllm?style=for-the-badge&logo=docker&label=Docker%20Pulls&link=https%3A%2F%2Fhub.docker.com%2Frepository%2Fdocker%2Frunpod%2Fworker-vllm%2Fgeneral) -->
<!--
![Docker Automatic Build](https://img.shields.io/github/actions/workflow/status/runpod-workers/worker-vllm/docker-build-release.yml?style=flat&label=BUILD) -->

Expand All @@ -18,16 +18,15 @@ Deploy OpenAI-Compatible Blazing-Fast LLM Endpoints powered by the [vLLM](https:
### 1. UI for Deploying vLLM Worker on RunPod console:
![Demo of Deploying vLLM Worker on RunPod console with new UI](media/ui_demo.gif)

### 2. Worker vLLM `1.0.0preview` with vLLM `0.4.2` now available under `stable` tags
Update 1.0.0preview is now available, use the image tag `runpod/worker-vllm:dev-cuda12.1.0` or `runpod/worker-vllm:dev-cuda11.8.0`.
### 2. Worker vLLM `1.0.0` with vLLM `0.4.2` now available under `stable` tags
Update 1.0.0 is now available, use the image tag `runpod/worker-vllm:stable-cuda12.1.0` or `runpod/worker-vllm:stable-cuda11.8.0`.

**Main Changes:**
- vLLM was updated from version `0.3.3` to `0.4.2`, adding compatibility for Llama 3 and other models, as well as increasing performance.
### 3. OpenAI-Compatible [Embedding Worker](https://github.com/runpod-workers/worker-infinity-embedding) Released
Deploy your own OpenAI-compatible Serverless Endpoint on RunPod with multiple embedding models and fast inference for RAG and more!

We will soon be adding more features from the updates, such as multi-LoRA, multi-modality, and more.


### 3. Caching Accross RunPod Machines
### 4. Caching Accross RunPod Machines
Worker vLLM is now cached on all RunPod machines, resulting in near-instant deployment! Previously, downloading and extracting the image took 3-5 minutes on average.


Expand Down
2 changes: 1 addition & 1 deletion docker-bake.hcl
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@ variable "REPOSITORY" {
}

variable "BASE_IMAGE_VERSION" {
default = "1.0.0preview"
default = "1.0.0"
}

group "all" {
Expand Down
3 changes: 1 addition & 2 deletions vllm-base-image/vllm-metadata.yml
Original file line number Diff line number Diff line change
@@ -1,3 +1,2 @@
version: '0.3.3'
version: '0.4.2'
dev_version: '0.4.2'
worker_dev_version: '1.0.0preview'

0 comments on commit c8458fe

Please sign in to comment.