LermoAI Support for LLaMA.cpp HTTP Server

LermoAI supports the LLaMA.cpp HTTP Server, so you can deploy and serve your own models. The server hosts a model through the LLaMA.cpp framework and makes it available over HTTP.

For more detailed information, refer to the official LLaMA.cpp HTTP Server documentation.

Example Dockerfile

The example Dockerfiles listed below set up the LLaMA.cpp HTTP Server. They run on the CPU, but you can modify them to use a GPU if needed.

Dockerfile Examples

  • Dockerfile.llama
  • Dockerfile.mistral

Running the Dockerfile

You can build and run the Dockerfile on your local machine. For a more scalable and accessible deployment, we recommend hosting the server on Hugging Face Spaces with Docker. See the Hugging Face Spaces documentation for details.
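
Once the container is running, a quick way to confirm the server responds is to send it a prompt. The following is a minimal sketch, not part of LermoAI itself. It assumes the server is reachable at `http://localhost:8080` (a hypothetical address; a Hugging Face Space will have its own URL) and that your LLaMA.cpp server version exposes the `/completion` endpoint, which takes a JSON body with `prompt` and `n_predict` and returns the generated text in a `content` field. Check these against the server documentation for the version you deploy. The helper names `build_completion_request` and `complete` are illustrative only.

```python
import json
import urllib.request

# Assumption: the container publishes the server on localhost:8080.
# Replace this with your own host, port, or Space URL.
BASE_URL = "http://localhost:8080"


def build_completion_request(prompt, n_predict=64, base_url=BASE_URL):
    """Build a POST request for the server's /completion endpoint."""
    payload = json.dumps({"prompt": prompt, "n_predict": n_predict}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/completion",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, n_predict=64, base_url=BASE_URL, timeout=60):
    """Send the prompt to the server and return the generated text."""
    request = build_completion_request(prompt, n_predict, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # Assumption: the response JSON carries the generated text under "content".
    return body["content"]
```

With the server up, calling `complete("Hello, my name is")` should return the model's continuation as a string. Building the request is kept separate from sending it so the payload can be inspected without a running server.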