# Language Model Flask Service

The Language Model Flask Service is a Flask-based application that provides an interface for interacting with language models, tailored for mission engineering expert analysis. It uses `langchain_community.embeddings.HuggingFaceEmbeddings` to leverage state-of-the-art language models for generating insights and answering queries within a specified domain.
## Features

- **Query Handling:** Process queries using advanced language models.
- **Insight Generation:** Generate insights based on the context provided to the system.
- **Flexible Deployment:** Ready for deployment in both internet-connected and air-gapped (offline) environments.
## Prerequisites

- Python 3.9+
- Flask
- Requests (for environments with internet access)
- Additional Python packages as specified in `requirements.txt`
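As a starting point, a `requirements.txt` consistent with the components this README names might contain the packages below; the exact package names and any version pins are assumptions to verify against your environment:

```text
flask
requests
langchain-community
sentence-transformers
```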
## Installation

Ensure you have Python 3.9 or newer installed on your system.

1. **Clone the repository**

   ```bash
   git clone <repository-url>
   cd llm_flask_service
   ```

2. **Set up a Python environment (optional)**

   It's recommended to use a virtual environment:

   ```bash
   python3 -m venv venv
   source venv/bin/activate
   ```
3. **Install dependencies**

   ```bash
   pip install -r requirements.txt
   ```
4. **Configure the environment**

   For air-gapped systems, ensure all necessary models and dependencies are pre-downloaded and available locally, and set environment variables accordingly:

   ```bash
   export SENTENCE_TRANSFORMERS_HOME=/path/to/your/cache
   ```
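   On a machine with internet access, the embedding model can be cached ahead of time and the cache directory copied to the air-gapped host. A minimal sketch, assuming the `sentence-transformers` package and a hypothetical model name (substitute the model your deployment actually uses):

   ```python
   from sentence_transformers import SentenceTransformer

   # Hypothetical model name; use the model your service is configured for.
   # Downloads the model into the cache directory that
   # SENTENCE_TRANSFORMERS_HOME will point to on the air-gapped host.
   SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2",
                       cache_folder="/path/to/your/cache")
   ```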
5. **Run the application**

   Start the Flask application:

   ```bash
   flask run --host=0.0.0.0
   ```

   For production environments, consider deploying with a WSGI server such as Gunicorn.
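   As a sketch, assuming the Flask application object is named `app` in `app.py` (adjust the module path to match your project):

   ```bash
   pip install gunicorn
   gunicorn --bind 0.0.0.0:5000 app:app
   ```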
## Usage

After starting the application, you can interact with the service by sending HTTP POST requests to the `/query` endpoint. Initial queries that start a conversation do not require a `qhid` (Query History ID). The response to an initial query includes a `qhid` that uniquely identifies the conversation thread; to continue the conversation and maintain context, include this `qhid` in subsequent queries.

### Starting a Conversation
For the first query in a conversation thread, you only need to provide a `query` and an optional `context`:

```bash
curl -X POST http://localhost:5000/query \
  -H "Content-Type: application/json" \
  -d '{"query": "What are the most critical design considerations for reducing the radar cross-section of an aircraft?", "context": "You are developing a new type of aircraft designed to minimize radar cross-section and maximize fuel efficiency."}'
```
Replace `http://localhost:5000/query` with the appropriate URL if the service is deployed on a different host or port.
The response to your initial query includes a `qhid` along with the answer. A typical response looks like this:

```json
{
  "qhid": "eff7b2ab-c41d-4db9-bc5d-cc37b60d20ec",
  "response": "Response text with insights on reducing the radar cross-section of an aircraft..."
}
```
### Continuing a Conversation

To continue the conversation with its context intact, include the `qhid` returned by the previous response in your next request:

```bash
curl -X POST http://localhost:5000/query \
  -H "Content-Type: application/json" \
  -d '{"query": "Can you provide more details on shape optimization?", "context": "Continuing from our previous discussion on aircraft design.", "qhid": "eff7b2ab-c41d-4db9-bc5d-cc37b60d20ec"}'
```
This gives you a coherent, context-aware conversation with the service, with each response informed by the ongoing discussion thread.
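For scripted use, the same two-step flow can be driven with the `requests` package. The following is a minimal sketch, not part of the service itself; it assumes the `/query` endpoint, request fields, and response shape shown above.

```python
import requests

BASE_URL = "http://localhost:5000"  # adjust if deployed elsewhere

def ask(query, context=None, qhid=None):
    """Send a query to the service, passing qhid to continue a thread."""
    payload = {"query": query}
    if context is not None:
        payload["context"] = context
    if qhid is not None:
        payload["qhid"] = qhid
    resp = requests.post(f"{BASE_URL}/query", json=payload)
    resp.raise_for_status()
    return resp.json()

# Initial query: no qhid, so the service starts a new conversation thread.
first = ask(
    "What are the most critical design considerations for reducing "
    "the radar cross-section of an aircraft?",
    context="You are developing a new type of aircraft designed to "
            "minimize radar cross-section and maximize fuel efficiency.",
)
print(first["response"])

# Follow-up query: reuse the returned qhid to keep the conversation context.
followup = ask("Can you provide more details on shape optimization?",
               qhid=first["qhid"])
print(followup["response"])
```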
## Docker Deployment

Build and run the service with Docker:

```bash
docker build -t ragllm2 .
docker run -p 5000:5000 -p 8888:8888 ragllm2
```
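If the repository does not already include a `Dockerfile`, a minimal sketch is shown below; the base image, entrypoint module (`app.py`), and the purpose of the second exposed port are assumptions to adjust for your application.

```dockerfile
# Minimal sketch of a Dockerfile; assumes the Flask app lives in app.py.
FROM python:3.9-slim

WORKDIR /app

# Install dependencies first so Docker can cache this layer.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

# 5000 for the Flask API; 8888 matches the second port published above.
EXPOSE 5000 8888

ENV FLASK_APP=app.py
CMD ["flask", "run", "--host=0.0.0.0"]
```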
## Contributing

Contributions are welcome! Please read our contributing guidelines for how to propose updates or improvements.
## License

This project is licensed under the MIT License - see the LICENSE file for details.