LLM-Powered Document Q&A with Llama 3.2

A Retrieval-Augmented Generation (RAG) application that enables intelligent document analysis and question answering using Llama 3.2. Built with Streamlit, LangChain, and Ollama.
Features

- Multi-format Document Support: Process PDF, DOCX, TXT, MD, and CSV files with intelligent text extraction.
- Advanced Text Processing: Automatic text chunking and embedding using LangChain.
- Vector Search: FAISS-powered similarity search for accurate document retrieval.
- Configurable Parameters: Adjustable model settings, chunk sizes, and temperature.
- Comprehensive Metadata Tracking: Detailed processing statistics and response metrics.
- Interactive UI: Clean Streamlit interface with chat history and document statistics.
Prerequisites

- Python 3.8+
- Ollama installed and running locally
- Required Python packages:

```
ollama
streamlit
chardet
PyPDF2
python-docx
langchain
faiss-cpu
```
Installation

- Clone the repository:

```
git clone <repository-url>
cd LLM-Powered-Document-Q-A-with-Llama-3.2
```

- Install dependencies:

```
pip install -r requirements.txt
```

- Ensure Ollama is running locally on port 11434 (e.g., `curl http://localhost:11434/api/tags` should list your installed models).
🚀 Quick Start
Install Ollama and pull Llama 3.2:

```
ollama pull llama3.2
```

Install dependencies:

```
pip install -r requirements.txt
```

Run the app:

```
streamlit run app.py
```
Usage

- Start the application:

```
streamlit run app.py
```

- Access the web interface at http://localhost:8501.
- Configure the application using the sidebar:
  - Select the LLM model.
  - Adjust temperature and chunk size.
- Upload documents or paste text directly.
- Enter questions and receive AI-powered responses based on your documents.
Core Components

DocumentProcessor
- Handles multiple document formats.
- Extracts text with metadata.
- Provides error handling and encoding detection.
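
How such a processor might look is sketched below, using the libraries from `requirements.txt`; the function name and metadata fields are illustrative, not the repository's exact API:

```python
from pathlib import Path

import chardet
from docx import Document
from PyPDF2 import PdfReader


def extract_text(path: str) -> dict:
    """Return extracted text plus simple metadata for a supported file."""
    suffix = Path(path).suffix.lower()
    if suffix == ".pdf":
        reader = PdfReader(path)
        text = "\n".join(page.extract_text() or "" for page in reader.pages)
    elif suffix == ".docx":
        text = "\n".join(p.text for p in Document(path).paragraphs)
    else:
        # .txt, .md, .csv: sniff the encoding before decoding
        raw = Path(path).read_bytes()
        encoding = chardet.detect(raw)["encoding"] or "utf-8"
        text = raw.decode(encoding, errors="replace")
    return {"text": text, "source": path, "chars": len(text)}
```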
EnhancedRAGApplication
- Manages document indexing and retrieval.
- Handles embedding generation.
- Coordinates response generation.
- Tracks performance metrics.
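
A condensed sketch of the index-and-answer flow (function names are illustrative; LangChain import paths vary by version, with newer releases moving these classes into `langchain_community`):

```python
import ollama
from langchain.embeddings import OllamaEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import FAISS

splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
embeddings = OllamaEmbeddings(model="llama3.2")


def build_index(text: str) -> FAISS:
    """Chunk the raw text and embed each chunk into a FAISS index."""
    return FAISS.from_texts(splitter.split_text(text), embeddings)


def answer(index: FAISS, question: str) -> str:
    """Retrieve the top matching chunks and ask the model with them as context."""
    docs = index.similarity_search(question, k=4)
    context = "\n\n".join(d.page_content for d in docs)
    response = ollama.chat(
        model="llama3.2",
        messages=[{
            "role": "user",
            "content": f"Answer using only this context:\n{context}\n\nQuestion: {question}",
        }],
    )
    return response["message"]["content"]
```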
Streamlit UI
- Provides interactive interface.
- Displays document statistics.
- Maintains chat history.
- Shows response metadata.
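
The chat-history behavior follows the standard `st.session_state` pattern; this sketch is illustrative, and the app's actual widget layout will differ:

```python
import streamlit as st

if "history" not in st.session_state:
    st.session_state.history = []  # list of (question, answer) pairs

# Replay earlier turns so the conversation survives Streamlit reruns.
for q, a in st.session_state.history:
    st.chat_message("user").write(q)
    st.chat_message("assistant").write(a)

if question := st.chat_input("Ask a question about your documents"):
    st.chat_message("user").write(question)
    answer = "..."  # call into the RAG pipeline here
    st.chat_message("assistant").write(answer)
    st.session_state.history.append((question, answer))
```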
Configuration

- `model_name`: LLM model selection (default: "llama3.2")
- `embedding_model`: Model for generating embeddings
- `chunk_size`: Text chunk size (default: 500)
- `chunk_overlap`: Overlap between chunks (default: 50)
- `temperature`: Response creativity (0.0-1.0)
- `max_tokens`: Maximum response length
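
Grouped as a dataclass for illustration (the app may wire these directly into sidebar widgets; defaults marked "assumed" are not documented above):

```python
from dataclasses import dataclass


@dataclass
class RAGConfig:
    model_name: str = "llama3.2"
    embedding_model: str = "llama3.2"  # assumed default
    chunk_size: int = 500
    chunk_overlap: int = 50
    temperature: float = 0.7           # assumed default; adjustable 0.0-1.0
    max_tokens: int = 1024             # assumed default


# e.g., lower temperature for more deterministic answers:
config = RAGConfig(temperature=0.2)
```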
Error Handling

The application includes comprehensive error handling:
- Document processing validation
- Encoding detection and fallback
- Vector store initialization checks
- Response generation monitoring
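
An illustrative example of the encoding-detection-with-fallback pattern (not the app's exact code):

```python
import logging
from typing import Optional

import chardet

logger = logging.getLogger(__name__)


def read_text_safely(path: str) -> Optional[str]:
    """Read a text file, trying the detected encoding before common fallbacks."""
    try:
        with open(path, "rb") as f:
            raw = f.read()
    except OSError:
        logger.exception("Could not read %s", path)
        return None
    detected = chardet.detect(raw)["encoding"]
    for encoding in filter(None, [detected, "utf-8", "latin-1"]):
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    return None
```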
Logging

Detailed logging is implemented throughout:
- Document processing events
- Vector store operations
- Response generation metrics
- Error tracking
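
A typical setup for this kind of app (the logger name and example messages are placeholders, not the app's exact output):

```python
import logging

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(name)s %(levelname)s: %(message)s",
)
logger = logging.getLogger("rag_app")

# Events like these make processing and retrieval traceable:
logger.info("Indexed %d chunks from %s", 42, "report.pdf")
logger.info("Generated answer in %.2fs using %d context chunks", 1.4, 4)
```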
Performance

- Document chunking balances retrieval context against indexing and search cost (see the sketch after this list).
- Vector similarity search uses efficient FAISS indexing.
- Response generation includes context length management.
- UI updates are optimized for responsiveness.
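
To see the chunking trade-off concretely, compare chunk counts at different sizes; larger chunks mean fewer embeddings to compute and search, at the cost of coarser retrieval (illustrative script):

```python
from langchain.text_splitter import RecursiveCharacterTextSplitter

text = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. " * 400

for size in (250, 500, 1000):
    splitter = RecursiveCharacterTextSplitter(chunk_size=size, chunk_overlap=50)
    print(f"chunk_size={size}: {len(splitter.split_text(text))} chunks")
```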
Contributing

- Fork the repository.
- Create a feature branch.
- Implement changes with tests.
- Submit a pull request.
License

MIT License
Acknowledgments

- Ollama team for the LLM integration.
- Streamlit for the UI framework.
- LangChain for document processing utilities.
- FAISS for vector search capabilities.