Your own little LLM in your Matrix chatroom.
This project is split into two parts: the client and the server.
The server simply downloads an LLM and starts a llama-cpp-python server (which exposes an OpenAI-compatible API).
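
For reference, starting such a server is a one-liner with llama-cpp-python's built-in server module; the model path below is a placeholder:

```bash
# Install llama-cpp-python with the optional server dependencies,
# then serve a local GGUF model on the default port 8000.
pip install 'llama-cpp-python[server]'
python -m llama_cpp.server --model ./models/model.gguf
```
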
The client connects to the Matrix homeserver and queries the llama-cpp-python server to generate replies as Matrix messages.
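
A minimal sketch of that client loop, assuming the `matrix-nio` and `openai` packages; the homeserver URL, bot account, and credentials are placeholders:

```python
# Listen for room messages, forward them to the local llama-cpp-python
# server, and post the completion back into the room.
import asyncio
from nio import AsyncClient, MatrixRoom, RoomMessageText
from openai import OpenAI

# Point the OpenAI client at the local OpenAI-compatible server.
llm = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

async def main():
    client = AsyncClient("https://matrix.example.org", "@bot:example.org")
    await client.login("password")  # placeholder credentials

    async def on_message(room: MatrixRoom, event: RoomMessageText):
        # Skip our own messages to avoid reply loops.
        if event.sender == client.user_id:
            return
        completion = llm.chat.completions.create(
            model="local",  # with a single loaded model, the name is not significant
            messages=[{"role": "user", "content": event.body}],
        )
        await client.room_send(
            room.room_id,
            message_type="m.room.message",
            content={"msgtype": "m.text",
                     "body": completion.choices[0].message.content},
        )

    client.add_event_callback(on_message, RoomMessageText)
    await client.sync_forever(timeout=30000)

asyncio.run(main())
```
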