Phi-3 LLM Inference Server

A simple REST API for the Microsoft Phi-3 large language models (LLMs). It is served with FastAPI, and its inputs and outputs mirror the interface used by the Hugging Face Text Generation Inference server.

Install

Python 3.10 or above is recommended. Install the dependencies with:

pip install -r requirements.txt

Start the Inference Server

To start the server, run:

uvicorn app:app --host 0.0.0.0 --port 8000

Usage

To send an inference request:

curl -X POST "http://localhost:8000/generate" \
    -H "Content-Type: application/json" \
    -d '{
        "inputs": "<|user|>\nHow old is the universe? <|end|>\n<|assistant|>",
        "parameters": {
            "best_of": 1,
            "decoder_input_details": false,
            "details": true,
            "do_sample": true,
            "frequency_penalty": 0.0,
            "grammar": null,
            "max_new_tokens": 20,
            "repetition_penalty": 1.0,
            "return_full_text": false,
            "seed": null,
            "stop": [
                "\n\n"
            ],
            "temperature": 0.2,
            "top_k": 10,
            "top_n_tokens": 5,
            "top_p": 0.95,
            "truncate": null,
            "typical_p": 0.95,
            "watermark": true
        }
    }'

The server should return a response like:

{"generated_text": "The universe is approximately 13.8 billion years old."}

You can also check the server status by sending a GET request to /health:

curl "http://localhost:8000/health"
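A script that depends on the server may want to check /health before sending requests. The sketch below is one way to do that with the standard library. The function name `is_healthy` and the default URL are hypothetical, and it assumes a healthy server answers GET /health with HTTP status 200; the response body is not inspected.

```python
import urllib.request


def is_healthy(base_url: str = "http://localhost:8000", timeout: float = 5.0) -> bool:
    """Return True if GET /health answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError, refused connections, and timeouts
        # are all subclasses of OSError.
        return False
```

For example, a client could poll `is_healthy()` in a loop after launching uvicorn and only start sending /generate requests once it returns True.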
