This project uses a combination of Uvicorn, FastAPI (Python) and Docker to provide a reliable REST API for testing out Microsoft's BitNet locally!
It supports running the inference framework, benchmarking BitNet models, and calculating their perplexity values.
It offers the same functionality as the Electron-BitNet project, but exposes it through a REST API that devs/researchers can use to automate testing and benchmarking of 1-bit BitNet models!
Install Conda: https://anaconda.org/anaconda/conda
Initialize the Python environment:
conda init
conda create -n bitnet python=3.9
conda activate bitnet
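To double-check that the environment is active and using the expected interpreter:
python --version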
Install the Hugging Face CLI tool to download the models:
pip install -U "huggingface_hub[cli]"
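You can verify the CLI installed correctly before downloading anything:
huggingface-cli --help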
Download one or more of the 1-bit models below from Hugging Face:
huggingface-cli download 1bitLLM/bitnet_b1_58-large --local-dir app/models/bitnet_b1_58-large
huggingface-cli download 1bitLLM/bitnet_b1_58-3B --local-dir app/models/bitnet_b1_58-3B
huggingface-cli download HF1BitLLM/Llama3-8B-1.58-100B-tokens --local-dir app/models/Llama3-8B-1.58-100B-tokens
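To confirm a download completed, list the model directory, e.g.:
ls app/models/bitnet_b1_58-large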
Build the Docker image:
docker build -t fastapi_bitnet .
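You can confirm the image built successfully by listing it:
docker image ls fastapi_bitnet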
Run the Docker container:
docker run -d --name ai_container -p 8080:8080 fastapi_bitnet
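To check that the container started cleanly, tail its logs (the name ai_container comes from the run command above):
docker logs -f ai_container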
Once it's running, navigate to http://127.0.0.1:8080/docs
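The Swagger UI at /docs lists every available endpoint. For scripted clients, FastAPI also serves the raw OpenAPI schema at /openapi.json by default:
curl http://127.0.0.1:8080/openapi.json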
Note:
If you plan to use this in production, make sure to extend the Docker image with additional authentication/security measures; in its current state it's intended for local use only.
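As a minimal safeguard in the meantime, you can bind the published port to the loopback interface so the API is only reachable from the host machine (this is standard Docker port-binding syntax, not a project-specific feature):
docker run -d --name ai_container -p 127.0.0.1:8080:8080 fastapi_bitnet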
Building the Docker image requires upwards of 40GB of RAM for Llama3-8B-1.58-100B-tokens; if you have less than 64GB of RAM you will probably run into issues.
The Dockerfile deletes the larger f32 files to reduce the Docker image build time; you'll need to comment out the find /code/models/.... lines if you want the larger f32 files included.