Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

swap granite for mistral cleanup + model and recipe housekeeping #350

Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion .github/workflows/chatbot.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -63,7 +63,7 @@ jobs:

- name: Download model
working-directory: ./models
run: make download-model-mistral
run: make download-model-granite

- name: Run Functional Tests
shell: bash
Expand Down
2 changes: 1 addition & 1 deletion .github/workflows/codegen.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -63,7 +63,7 @@ jobs:

- name: Download model
working-directory: ./models
run: make download-model-mistral
run: make download-model-mistral-code

- name: Run Functional Tests
shell: bash
Expand Down
3 changes: 3 additions & 0 deletions .github/workflows/model_image_build_push.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -43,6 +43,9 @@ jobs:
label: Q4_K_M
url: https://huggingface.co/instructlab/granite-7b-lab-GGUF/resolve/main/granite-7b-lab-Q4_K_M.gguf
platforms: linux/amd64,linux/arm64
- image_name: whisper-small
url: https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-small.bin
platforms: linux/amd64,linux/arm64
runs-on: ubuntu-latest
permissions:
contents: read
Expand Down
4 changes: 2 additions & 2 deletions .github/workflows/model_servers.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -27,13 +27,13 @@ jobs:
matrix:
include:
- image_name: llamacpp_python
model: mistral
model: granite
flavor: base
directory: llamacpp_python
platforms: linux/amd64,linux/arm64
no_gpu: 1
- image_name: llamacpp_python_cuda
model: mistral
model: granite
flavor: cuda
directory: llamacpp_python
platforms: linux/amd64
Expand Down
2 changes: 1 addition & 1 deletion .github/workflows/rag.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -68,7 +68,7 @@ jobs:

- name: Download model
working-directory: ./models
run: make download-model-mistral
run: make download-model-granite

- name: Run Functional Tests
shell: bash
Expand Down
2 changes: 1 addition & 1 deletion .github/workflows/summarizer.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -63,7 +63,7 @@ jobs:

- name: Download model
working-directory: ./models
run: make download-model-mistral
run: make download-model-granite

- name: Run Functional Tests
shell: bash
Expand Down
35 changes: 27 additions & 8 deletions ailab-images.md
Original file line number Diff line number Diff line change
@@ -1,19 +1,38 @@
## Images (x86_64, aarch64) currently built from GH Actions in this repository
## Model Server Images (amd64, arm64) currently built from GH Actions in this repository

- quay.io/ai-lab/llamacpp_python:latest
- quay.io/ai-lab/llamacpp_python_cuda:latest
- quay.io/ai-lab/llamacpp_python_vulkan:latest
- quay.io/ai-lab/llamacpp-python-cuda:latest
- quay.io/ai-lab/llamacpp-python-vulkan:latest
- quay.io/redhat-et/locallm-object-detection-server:latest

## Recipe Images (amd64, arm64)
- quay.io/ai-lab/summarizer:latest
- quay.io/ai-lab/chatbot:latest
- quay.io/ai-lab/rag:latest
- quay.io/ai-lab/codegen:latest
- quay.io/ai-lab/chromadb:latest
- quay.io/redhat-et/locallm-object-detection-client:latest
- quay.io/redhat-et/locallm-object-detection-server:latest

## Model Images (x86_64, aarch64)
## Dependency images

Images used in the `Bootc` aspect of this repo or tooling images

- quay.io/ai-lab/nvidia-builder:latest
- quay.io/ai-lab/instructlab-nvidia:latest
- quay.io/ai-lab/nvidia-bootc:latest

- quay.io/ai-lab/chromadb:latest
- quay.io/ai-lab/model-converter:latest

## Model Images (amd64, arm64)

- quay.io/ai-lab/merlinite-7b-lab:latest
- [model download link](https://huggingface.co/instructlab/merlinite-7b-lab-GGUF/resolve/main/merlinite-7b-lab-Q4_K_M.gguf)
- quay.io/ai-lab/granite-7b-lab:latest
- [model download link](https://huggingface.co/instructlab/granite-7b-lab-GGUF/resolve/main/granite-7b-lab-Q4_K_M.gguf)
- quay.io/ai-lab/mistral-7b-instruct:latest
- [model download link](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GGUF/resolve/main/mistral-7b-instruct-v0.2.Q4_K_M.gguf)
- quay.io/ai-lab/codellama-7b:latest
- [model download link](https://huggingface.co/TheBloke/CodeLlama-7B-Instruct-GGUF/resolve/main/codellama-7b-instruct.Q4_K_M.gguf)
- quay.io/ai-lab/mistral-7b-code-16k-qlora:latest
- [model download link](https://huggingface.co/TheBloke/Mistral-7B-Code-16K-qlora-GGUF/resolve/main/mistral-7b-code-16k-qlora.Q4_K_M.gguf)
- quay.io/ai-lab/whisper-small:latest
- [model download link](https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-small.bin)

4 changes: 2 additions & 2 deletions model_servers/llamacpp_python/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -25,5 +25,5 @@ build-vulkan:

.PHONY: download-model-granite # default model
download-model-granite:
cd ../../models && \
make MODEL_NAME=granite-7b-lab-Q4_K_M.gguf MODEL_URL=https://huggingface.co/instructlab/granite-7b-lab-GGUF/resolve/main/granite-7b-lab-Q4_K_M.gguf -f Makefile download-model
cd ../../models/ && \
make download-model-granite
2 changes: 1 addition & 1 deletion model_servers/llamacpp_python/tests/conftest.py
Original file line number Diff line number Diff line change
Expand Up @@ -13,7 +13,7 @@
IMAGE_NAME = os.environ['IMAGE_NAME']

if not 'MODEL_NAME' in os.environ:
MODEL_NAME = 'mistral-7b-instruct-v0.2.Q4_K_M.gguf'
MODEL_NAME = 'granite-7b-lab-Q4_K_M.gguf'
else:
MODEL_NAME = os.environ['MODEL_NAME']

Expand Down
2 changes: 1 addition & 1 deletion model_servers/whispercpp/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,7 @@ MODEL_NAME ?= ggml-small.bin
.PHONY: all
all: build download-model-whisper-small run

.PHONY: download-model-whisper-small # small .bin model type testing
.PHONY: download-model-whisper-small
download-model-whisper-small:
cd ../../models && \
make download-model-whisper-small
4 changes: 2 additions & 2 deletions models/Containerfile
Original file line number Diff line number Diff line change
@@ -1,8 +1,8 @@
# Suggested alternative open AI Models
# https://huggingface.co/instructlab/granite-7b-lab-GGUF/resolve/main/granite-7b-lab-Q4_K_M.gguf
# https://huggingface.co/instructlab/granite-7b-lab-GGUF/resolve/main/granite-7b-lab-Q4_K_M.gguf (Default)
# https://huggingface.co/instructlab/merlinite-7b-lab-GGUF/resolve/main/merlinite-7b-lab-Q4_K_M.gguf
# https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q5_K_S.gguf
# https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GGUF/resolve/main/mistral-7b-instruct-v0.2.Q4_K_M.gguf (Default)
# https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GGUF/resolve/main/mistral-7b-instruct-v0.2.Q4_K_M.gguf
# https://huggingface.co/TheBloke/CodeLlama-7B-Instruct-GGUF/resolve/main/codellama-7b-instruct.Q4_K_M.gguf
# https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-small.bin
# podman build --build-arg="MODEL_URL=https://..." -t quay.io/yourimage .
Expand Down
10 changes: 7 additions & 3 deletions models/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -25,15 +25,19 @@ download-model-granite:
download-model-merlinite:
$(MAKE) MODEL_URL=https://huggingface.co/instructlab/merlinite-7b-lab-GGUF/resolve/main/merlinite-7b-lab-Q4_K_M.gguf MODEL_NAME=merlinite-7b-lab-Q4_K_M.gguf download-model

.PHONY: download-model-whisper-small # small .bin model type testing
.PHONY: download-model-whisper-small
download-model-whisper-small:
$(MAKE) MODEL_NAME=ggml-small.bin MODEL_URL=https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-small.bin download-model

.PHONY: download-model-mistral
download-model-mistral:
$(MAKE) MODEL_NAME=mistral-7b-instruct-v0.2.Q4_K_M.gguf MODEL_URL=https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GGUF/resolve/main/mistral-7b-instruct-v0.2.Q4_K_M.gguf download-model

.PHONY: download-model-mistral-code
download-model-mistral-code:
$(MAKE) MODEL_NAME=mistral-7b-code-16k-qlora.Q4_K_M.gguf MODEL_URL=https://huggingface.co/TheBloke/Mistral-7B-Code-16K-qlora-GGUF/resolve/main/mistral-7b-code-16k-qlora.Q4_K_M.gguf download-model

.PHONY: clean
clean:
rm -f *tmp
rm -f mistral* whisper* granite* merlinite*
-rm -f *tmp
-rm -f mistral* ggml-* granite* merlinite*
1 change: 1 addition & 0 deletions models/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@ Want to try one of our tested models? Try one or all of the following:
make download-model-granite
make download-model-merlinite
make download-model-mistral
make download-model-mistral-code
make download-model-whisper-small
```

Expand Down
4 changes: 2 additions & 2 deletions recipes/audio/audio_to_text/bootc/Containerfile
Original file line number Diff line number Diff line change
Expand Up @@ -14,9 +14,9 @@ RUN set -eu; mkdir -p /usr/ssh && \
echo ${SSHPUBKEY} > /usr/ssh/root.keys && chmod 0600 /usr/ssh/root.keys

ARG RECIPE=audio-to-text
ARG MODEL_IMAGE=quay.io/ai-lab/mistral-7b-instruct:latest
ARG MODEL_IMAGE=quay.io/ai-lab/whisper-small:latest
ARG APP_IMAGE=quay.io/ai-lab/${RECIPE}:latest
ARG SERVER_IMAGE=quay.io/ai-lab/llamacpp_python:latest
ARG SERVER_IMAGE=quay.io/ai-lab/whispercpp:latest
ARG TARGETARCH

# Add quadlet files to setup system to automatically run AI application on boot
Expand Down
8 changes: 4 additions & 4 deletions recipes/common/Makefile.common
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@ REGISTRY_ORG ?= ai-lab
IMAGE_NAME ?= $(REGISTRY_ORG)/${APP}:latest
APP_IMAGE ?= $(REGISTRY)/$(IMAGE_NAME)
CHROMADB_IMAGE ?= $(REGISTRY)/$(REGISTRY_ORG)/chromadb:latest
MODEL_IMAGE ?= $(REGISTRY)/$(REGISTRY_ORG)/mistral-7b-instruct:latest
MODEL_IMAGE ?= $(REGISTRY)/$(REGISTRY_ORG)/granite-7b-lab:latest
SERVER_IMAGE ?= $(REGISTRY)/$(REGISTRY_ORG)/llamacpp_python:latest
SSH_PUBKEY ?= $(shell cat ${HOME}/.ssh/id_rsa.pub;)
BOOTC_IMAGE ?= quay.io/$(REGISTRY_ORG)/${APP}-bootc:latest
Expand Down Expand Up @@ -62,10 +62,10 @@ UNZIP_EXISTS ?= $(shell command -v unzip)
RELATIVE_MODELS_PATH := ?=
RELATIVE_TESTS_PATH := ?=

MISTRAL_MODEL_NAME := mistral-7b-instruct-v0.2.Q4_K_M.gguf
MISTRAL_MODEL_URL := https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GGUF/resolve/main/mistral-7b-instruct-v0.2.Q4_K_M.gguf
GRANITE_MODEL_NAME := granite-7b-lab-Q4_K_M.gguf
GRANITE_MODEL_URL := https://huggingface.co/instructlab/granite-7b-lab-GGUF/resolve/main/granite-7b-lab-Q4_K_M.gguf

MODEL_NAME ?= $(MISTRAL_MODEL_NAME)
MODEL_NAME ?= $(GRANITE_MODEL_NAME)

.PHONY: install
install::
Expand Down
2 changes: 1 addition & 1 deletion recipes/common/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -30,7 +30,7 @@ used to override defaults for a variety of make targets.
|DISK_TYPE | Disk type to be created by BOOTC_IMAGE_BUILDER | `qcow2` (Options: ami, iso, vmdk, raw) |
|DISK_UID | Disk UID to be specified by BOOTC_IMAGE_BUILDER | `$(shell id -u)` |
|DISK_GID | Disk GID to be specified by BOOTC_IMAGE_BUILDER | `$(shell id -g)` |
|MODEL_IMAGE | AI Model to be used by application | `$(REGISTRY)/$(REGISTRY_ORG)/mistral-7b-instruct:latest`|
|MODEL_IMAGE | AI Model to be used by application | `$(REGISTRY)/$(REGISTRY_ORG)/granite-7b-lab:latest`|
|SERVER_IMAGE | AI Model Server Application | `$(REGISTRY)/$(REGISTRY_ORG)/llamacpp_python:latest` |
|SSH_PUBKEY | SSH Public key preloaded in bootc image. | `$(shell cat ${HOME}/.ssh/id_rsa.pub;)` |
|FROM | Overrides first FROM instruction within Containerfile| `FROM` line defined in the Containerfile |
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,7 @@ RUN set -eu; mkdir -p /usr/ssh && \
echo ${SSHPUBKEY} > /usr/ssh/root.keys && chmod 0600 /usr/ssh/root.keys

ARG RECIPE=chatbot
ARG MODEL_IMAGE=quay.io/ai-lab/mistral-7b-instruct:latest
ARG MODEL_IMAGE=quay.io/ai-lab/granite-7b-lab:latest
ARG APP_IMAGE=quay.io/ai-lab/${RECIPE}:latest
ARG SERVER_IMAGE=quay.io/ai-lab/llamacpp_python:latest
ARG TARGETARCH
Expand Down
1 change: 1 addition & 0 deletions recipes/natural_language_processing/codegen/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -8,3 +8,4 @@ RECIPE_BINARIES_PATH := $(shell realpath ../../common/bin)
RELATIVE_MODELS_PATH := ../../../models
RELATIVE_TESTS_PATH := ../tests
MODEL_IMAGE := quay.io/ai-lab/mistral-7b-code-16k-qlora:latest
MODEL_NAME := mistral-7b-code-16k-qlora.Q4_K_M.gguf
13 changes: 5 additions & 8 deletions recipes/natural_language_processing/rag/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -52,18 +52,15 @@ In order to build this application we will need two models, a Vector Database, a

### Download models

If you are just getting started, we recommend using [Mistral-7B-Instruct](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1). This is a well
performant mid-sized model with an apache-2.0 license. In order to use it with our Model Service we need it converted
and quantized into the [GGUF format](https://github.com/ggerganov/ggml/blob/master/docs/gguf.md). There are a number of
ways to get a GGUF version of Mistral-7B, but the simplest is to download a pre-converted one from
[huggingface.co](https://huggingface.co) here: https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF.
If you are just getting started, we recommend using [Granite-7B-Lab](https://huggingface.co/instructlab/granite-7b-lab-GGUF). This is a well
performant mid-sized model with an apache-2.0 license that has been quanitzed and served into the [GGUF format](https://github.com/ggerganov/ggml/blob/master/docs/gguf.md).

The recommended model can be downloaded using the code snippet below:

```bash
cd models
wget https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GGUF/resolve/main/mistral-7b-instruct-v0.2.Q4_K_M.gguf
cd ../
cd ../../../models
curl -sLO https://huggingface.co/instructlab/granite-7b-lab-GGUF/resolve/main/granite-7b-lab-Q4_K_M.gguf
cd ../recipes/natural_language_processing/rag
```

_A full list of supported open models is forthcoming._
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,7 @@ RUN set -eu; mkdir -p /usr/ssh && \
echo ${SSHPUBKEY} > /usr/ssh/root.keys && chmod 0600 /usr/ssh/root.keys

ARG RECIPE=rag
ARG MODEL_IMAGE=quay.io/ai-lab/mistral-7b-instruct:latest
ARG MODEL_IMAGE=quay.io/ai-lab/granite-7b-lab:latest
ARG APP_IMAGE=quay.io/ai-lab/${RECIPE}:latest
ARG SERVER_IMAGE=quay.io/ai-lab/llamacpp_python:latest
ARG CHROMADBImage=quay.io/ai-lab/chromadb
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,7 @@ RUN set -eu; mkdir -p /usr/ssh && \
echo ${SSHPUBKEY} > /usr/ssh/root.keys && chmod 0600 /usr/ssh/root.keys

ARG RECIPE=summarizer
ARG MODEL_IMAGE=quay.io/ai-lab/mistral-7b-instruct:latest
ARG MODEL_IMAGE=quay.io/ai-lab/granite-7b-lab:latest
ARG APP_IMAGE=quay.io/ai-lab/${RECIPE}:latest
ARG SERVER_IMAGE=quay.io/ai-lab/llamacpp_python:latest
ARG TARGETARCH
Expand Down