# add whisper quadlet, move whisper model-service, update docs #83

**Merged** · 6 commits:
- `dd1fc7e` add whisper quadlet & update docs (sallyom)
- `205c9d3` Update audio-to-text/README.md (sallyom)
- `4369d01` Update audio-to-text/README.md (sallyom)
- `da9dc79` Update audio-to-text/README.md (sallyom)
- `d0a85fd` Update audio-to-text/README.md (sallyom)
- `664d323` Update README.md (sallyom)

**`audio-to-text/README.md`** (new file):

# Audio to Text Application

This sample application is a simple recipe for transcribing an audio file. It is intended to help developers start building their own custom LLM-enabled audio-to-text applications. It consists of two main components: the Model Service and the AI Application.

There are a few options today for local model serving, but this recipe will use [`whisper-cpp`](https://github.com/ggerganov/whisper.cpp.git) and its included model server. A Containerfile is provided within the repo that can be used to build this Model Service: [`model_servers/whispercpp/Containerfile`](/model_servers/whispercpp/Containerfile).

Our AI Application will connect to our Model Service via its API.

<p align="center">
  <img src="../assets/whisper.png" width="70%">
</p>

# Build the Application

In order to build this application, we will need a model, a Model Service, and an AI Application.

* [Download a model](#download-a-model)
* [Build the Model Service](#build-the-model-service)
* [Deploy the Model Service](#deploy-the-model-service)
* [Build the AI Application](#build-the-ai-application)
* [Deploy the AI Application](#deploy-the-ai-application)
* [Interact with the AI Application](#interact-with-the-ai-application)
* [Input audio files](#input-audio-files)

### Download a model

If you are just getting started, we recommend using [ggerganov/whisper.cpp](https://huggingface.co/ggerganov/whisper.cpp). This is a performant mid-sized model with an Apache-2.0 license.

It's simple to download a pre-converted whisper model from [huggingface.co](https://huggingface.co) here: https://huggingface.co/ggerganov/whisper.cpp. There are a number of options, but we recommend starting with `ggml-small.bin`.

The recommended model can be downloaded using the code snippet below:

```bash
cd models
wget https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-small.bin
cd ../
```

_A full list of supported open models is forthcoming._
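
As a quick optional check, you can confirm the download completed; the `small` model is roughly 466 MB:

```bash
# Verify the downloaded model file is present and roughly the expected size
ls -lh models/ggml-small.bin
```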

### Build the Model Service

The Model Service can be built from the root directory with the following code snippet:

```bash
cd model_servers/whispercpp
podman build -t whispercppserver .
```

### Deploy the Model Service

The local Model Service relies on a volume mount to the localhost to access the model files. You can start your local Model Service using the following podman command:

```bash
podman run --rm -it \
        -p 8001:8001 \
        -v /local/path/to/locallm/models:/locallm/models \
        -e MODEL_PATH=models/<model-filename> \
        -e HOST=0.0.0.0 \
        -e PORT=8001 \
        whispercppserver
```
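
Before moving on, you can poll the port to confirm the Model Service came up. This is a minimal sketch, assuming the whisper.cpp server answers plain HTTP requests on its root path:

```bash
# Wait until the model service responds on port 8001 (assumed health check)
until curl -s -o /dev/null http://localhost:8001; do
    echo "waiting for model service..."
    sleep 2
done
echo "model service is up"
```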

### Build the AI Application

Now that the Model Service is running, we want to build and deploy our AI Application. Use the provided Containerfile to build the AI Application image from the `audio-to-text/` directory.

```bash
cd audio-to-text
podman build -t audio-to-text . -f builds/Containerfile
```

### Deploy the AI Application

Make sure the Model Service is up and running before starting this container image. When starting the AI Application container image, we need to direct it to the correct `MODEL_SERVICE_ENDPOINT`. This could be any appropriately hosted Model Service (running locally or in the cloud) exposing a compatible API.

The following podman command can be used to run your AI Application:

```bash
podman run --rm -it -p 8501:8501 -e MODEL_SERVICE_ENDPOINT=http://0.0.0.0:8001/inference audio-to-text
```
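
If the Model Service is running in a separate container on the same machine, the endpoint may need to resolve to the host rather than to the application container itself. A hedged variant, assuming a Podman version that provides the `host.containers.internal` host alias:

```bash
# Point the app at the host machine instead of the container's own loopback
podman run --rm -it -p 8501:8501 \
        -e MODEL_SERVICE_ENDPOINT=http://host.containers.internal:8001/inference \
        audio-to-text
```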

### Interact with the AI Application

Once the streamlit application is up and running, you should be able to access it at [`http://localhost:8501`](http://localhost:8501). From here, you can upload audio files from your local machine and transcribe them as shown below.

By using this recipe as a starting point, users should have an easier time customizing and building their own LLM-enabled audio-to-text applications.

#### Input audio files

Whisper.cpp requires 16-bit WAV audio files as input. To convert your input audio files to 16-bit WAV format, you can use `ffmpeg` like this:

```bash
ffmpeg -i <input.mp3> -ar 16000 -ac 1 -c:a pcm_s16le <output.wav>
```
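
If you have several files to convert, a small loop works too. This is a sketch, assuming all the inputs are `.mp3` files in the current directory:

```bash
# Convert every .mp3 in the current directory to 16-bit mono WAV
for f in *.mp3; do
    ffmpeg -i "$f" -ar 16000 -ac 1 -c:a pcm_s16le "${f%.mp3}.wav"
done
```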

*Three files were renamed without changes.*

**Quadlet `README.md`** (new file):

### Run audio-text locally as a podman pod

There are pre-built images and a pod definition to run this audio-to-text example application. This sample converts an audio waveform (.wav) file to text.

To run locally:

```bash
podman kube play ./quadlet/audio-to-text.yaml
```
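
When you are done, the same YAML can be used to tear the pod back down (assuming a Podman version that supports `podman kube down`):

```bash
# Stop and remove the pod defined by the YAML
podman kube down ./quadlet/audio-to-text.yaml
```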

To monitor locally:

```bash
podman pod list
podman ps
podman logs <name of container from the above>
```

The application should be accessible at `http://localhost:8501`. It will take a few minutes for the model to load.

### Run audio-text as a systemd service

```bash
cp audio-text.yaml /etc/containers/systemd/audio-text.yaml
cp audio-text.kube.example /etc/containers/audio-text.kube
cp audio-text.image /etc/containers/audio-text.image
# Optional: preview the units that quadlet will generate
/usr/libexec/podman/quadlet --dryrun
systemctl daemon-reload
systemctl start audio-text
```
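
Once started, the unit can be inspected like any other systemd service. A minimal sketch, assuming the generated service is named `audio-text`:

```bash
systemctl status audio-text
journalctl -u audio-text -f   # follow the service logs
```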

**`audio-text.image`** (new file):

```
[Install]
WantedBy=audio-text.service

[Image]
Image=quay.io/redhat-et/locallm-whisper-ggml-small:latest
Image=quay.io/redhat-et/locallm-whisper-service:latest
Image=quay.io/redhat-et/locallm-audio-to-text:latest
```

**`audio-text.kube.example`** (new file):

```
[Unit]
Description=Python script to run against downloaded LLM
Documentation=man:podman-generate-systemd(1)
Wants=network-online.target
After=network-online.target
RequiresMountsFor=%t/containers

[Kube]
# Point to the yaml file in the same directory
Yaml=audio-text.yaml

[Service]
Restart=always

[Install]
WantedBy=default.target
```

**`quadlet/audio-to-text.yaml`** (new file):

```yaml
apiVersion: v1
kind: Pod
metadata:
  labels:
    app: audio-to-text
  name: audio-to-text
spec:
  initContainers:
  - name: model-file
    image: quay.io/redhat-et/locallm-whisper-ggml-small:latest
    command: ['/usr/bin/install', "/model/ggml-small.bin", "/shared/"]
    volumeMounts:
    - name: model-file
      mountPath: /shared
  containers:
  - env:
    - name: MODEL_SERVICE_ENDPOINT
      value: http://0.0.0.0:8001/inference
    image: quay.io/redhat-et/locallm-audio-to-text:latest
    name: audio-to-text
    ports:
    - containerPort: 8501
      hostPort: 8501
    securityContext:
      runAsNonRoot: true
  - env:
    - name: HOST
      value: "0.0.0.0"
    - name: PORT
      value: "8001"
    - name: MODEL_PATH
      value: /model/ggml-small.bin
    image: quay.io/redhat-et/locallm-whisper-service:latest
    name: whisper-model-service
    ports:
    - containerPort: 8001
      hostPort: 8001
    securityContext:
      runAsNonRoot: true
    volumeMounts:
    - name: model-file
      mountPath: /model
  volumes:
  - name: model-file
    emptyDir: {}
```
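
The init container copies `ggml-small.bin` from the model image into the shared `emptyDir` volume before the service containers start. One way to confirm this worked (a sketch, assuming Podman's default `<pod>-<container>` container naming):

```bash
podman kube play ./quadlet/audio-to-text.yaml
# The model file should appear in the shared volume inside the service container
podman exec audio-to-text-whisper-model-service ls -lh /model
```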

*One file was renamed without changes.*

**`model_servers/whispercpp/README.md`** (new file):

## Whisper

Whisper models are useful for converting audio files to text. The sample application [audio-to-text](../audio-to-text/README.md) describes how to run an inference application. This document describes how to build a service for a Whisper model.

### Build model service

To build a Whisper model service container image from this directory:

```bash
podman build -t whisper:image .
```

### Download Whisper model

You can download the model from HuggingFace. There are various Whisper models available, varying in size, and they can be found [here](https://huggingface.co/ggerganov/whisper.cpp). We will be using the `small` model, which is about 466 MB.

- **small**
  - Download URL: [https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-small.bin](https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-small.bin)

```bash
cd ../models
wget --no-config --quiet --show-progress -O ggml-small.bin <Download URL>
cd ../
```

### Deploy Model Service

Deploy the model service and volume-mount the model of choice. Here, we are mounting the `ggml-small.bin` model downloaded above.

```bash
# Note: the :Z may need to be omitted from the model volume mount if not running on Linux
podman run --rm -it \
        -p 8001:8001 \
        -v /local/path/to/locallm/models/ggml-small.bin:/models/ggml-small.bin:Z,ro \
        -e HOST=0.0.0.0 \
        -e MODEL_PATH=/models/ggml-small.bin \
        -e PORT=8001 \
        whisper:image
```

By default, a sample `jfk.wav` file is included in the whisper image and can be used for testing. The environment variable `AUDIO_FILE` can be set to your own audio file to override the default `/app/jfk.wav` file within the whisper image.
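
With the service running, you can exercise it directly. This is a sketch, assuming the container exposes whisper.cpp's `/inference` endpoint on port 8001 and that you have a 16-bit WAV file on hand; the form field names follow the whisper.cpp server convention:

```bash
# Send a WAV file to the model service and request a JSON transcription
curl http://localhost:8001/inference \
    -H "Content-Type: multipart/form-data" \
    -F file="@jfk.wav" \
    -F response_format="json"
```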

**Model server entrypoint script** (new file):

```bash
#!/bin/bash

# Launch the whisper.cpp server, defaulting HOST and PORT if they are unset
./server -tr --model ${MODEL_PATH} --host ${HOST:=0.0.0.0} --port ${PORT:=8001}
```
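
Because `${HOST:=0.0.0.0}` and `${PORT:=8001}` only supply defaults, both can be overridden at run time. A hypothetical example:

```bash
# Run the model service on port 9000 instead of the default 8001
podman run --rm -it -p 9000:9000 \
    -e MODEL_PATH=/models/ggml-small.bin \
    -e PORT=9000 \
    whisper:image
```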

*Two files were deleted in this PR; their contents are not shown.*

**Review comment:**

> I don't think this model uses an OpenAI compatible API.
> https://platform.openai.com/docs/api-reference/audio/createTranscription?lang=curl
> We can probably just say, "and their included model server"

**Reply:** done