Baddie helps you talk to your documents: ask questions in natural language and get context-aware answers. I created this as a demo project to experiment with LLMs on Amazon Bedrock, and I plan to evolve it into something useful. Under the hood, it uses:
- FM for chat: Meta Llama 3 8B on Amazon Bedrock (as of now)
- FM for generating embeddings: Amazon Titan Multimodal Embeddings G1
- PostgreSQL with pgvector for storing/retrieving embeddings

Everything is tied together with LangChain and Streamlit.
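A minimal sketch of the retrieval idea behind this stack. This is not the app's actual code: toy 3-dimensional vectors and plain-Python cosine similarity stand in for Titan embeddings and pgvector's similarity search inside PostgreSQL.

```python
# Toy illustration of embedding retrieval (hypothetical data, not the app's code).
import math

def cosine_similarity(a, b):
    # Standard cosine similarity between two vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# "Document chunks" with pre-computed embeddings (stand-ins for Titan output).
chunks = [
    ("Invoices are due within 30 days.", [0.9, 0.1, 0.0]),
    ("The office is closed on Fridays.", [0.0, 0.8, 0.6]),
]

def retrieve(query_embedding, top_k=1):
    # Rank chunks by similarity to the query embedding, highest first;
    # the top chunks become the context passed to the chat model.
    ranked = sorted(chunks, key=lambda c: cosine_similarity(query_embedding, c[1]),
                    reverse=True)
    return [text for text, _ in ranked[:top_k]]

print(retrieve([1.0, 0.0, 0.1]))  # ['Invoices are due within 30 days.']
```

In the real app, pgvector runs this ranking as an SQL query over stored embeddings instead of in Python.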
There are two ways to run it.

To run locally:

- Create a python3 virtualenv and install the dependencies:

  ```
  pip install -r requirements.txt
  ```
- Add the required values to the keys in the `.env` file.
- Finally, run it with the command below:

  ```
  streamlit run streamlit_app.py --server.port=80 --server.address=0.0.0.0
  ```
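For the `.env` step, the actual key names are defined in the repo's own `.env` file; a hypothetical example of what such a file typically holds for a Bedrock + PostgreSQL setup:

```
# Hypothetical example — use the key names from the repo's .env file
AWS_ACCESS_KEY_ID=...
AWS_SECRET_ACCESS_KEY=...
AWS_DEFAULT_REGION=us-east-1
POSTGRES_CONNECTION_STRING=postgresql://user:password@localhost:5432/baddie
```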
To run with Docker instead, add the right values for `env_file` in `docker-compose.yaml` and then run:

```
docker-compose up -d
```
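For reference, `env_file` in Compose points a service at an env file; a minimal sketch (the service name, image, and port are assumptions, not the repo's actual file):

```yaml
services:
  app:            # hypothetical service name
    build: .
    env_file:
      - .env      # the same keys as in the local setup
    ports:
      - "80:80"
```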
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change. A few things I have in mind next:
- Use LangChain classes for holding chat context
- Vector indexing
- Session-wise doc storage (currently it's global)
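The session-wise storage item could look something like this — a hypothetical sketch using a dict keyed by session ID in place of the current global store:

```python
# Hypothetical sketch of session-wise document storage (the app currently
# uses a single global store). Each session ID maps to its own chunk list.
from collections import defaultdict

class SessionDocStore:
    def __init__(self):
        # session_id -> list of document chunks ingested in that session
        self._store = defaultdict(list)

    def add(self, session_id, chunks):
        self._store[session_id].extend(chunks)

    def get(self, session_id):
        # Only return documents belonging to this session, not everyone's.
        return list(self._store[session_id])

store = SessionDocStore()
store.add("sess-1", ["chunk A", "chunk B"])
store.add("sess-2", ["chunk C"])
print(store.get("sess-1"))  # ['chunk A', 'chunk B']
```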
Other, chunkier improvements:
- Auth and hosting
- Async ingestion pipeline for docs
- Support for other data sources
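The async ingestion idea could be sketched roughly as below — a hypothetical queue-and-workers shape (not the app's code), so document uploads don't block chat:

```python
# Hypothetical sketch of an async ingestion pipeline for docs: a queue of
# uploaded files consumed by worker tasks.
import asyncio

async def ingest(doc):
    # Stand-in for: load -> chunk -> embed -> store in pgvector.
    await asyncio.sleep(0)  # simulate I/O-bound embedding/DB calls
    return f"ingested:{doc}"

async def worker(queue, results):
    while True:
        doc = await queue.get()
        results.append(await ingest(doc))
        queue.task_done()

async def main(docs):
    queue, results = asyncio.Queue(), []
    workers = [asyncio.create_task(worker(queue, results)) for _ in range(2)]
    for d in docs:
        queue.put_nowait(d)
    await queue.join()  # wait until every doc is processed
    for w in workers:
        w.cancel()
    return results

print(asyncio.run(main(["a.pdf", "b.pdf"])))
```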
I found some good demos and docs while building this; I'll add a references section for them.