- Deployed on Amazon ECS as containers, which made deployments faster than running directly on EC2.
- Easily adaptable to other compute environments.
- Provides isolated environments, ensuring consistent deployments across different setups.
- Front-End Framework: Streamlit chosen for ease of building data applications for data analysts and data engineers.
- Python Implementation: Streamlit is a Python framework, so the front end is written entirely in Python.
- Time-Saving Components: Built-in Streamlit components reduced the need for custom UI coding.
- AWS as the Cloud Provider: Chosen for organizational support, team experience, and sandbox availability.
- Public Subnets: The default VPC and public subnet configuration was used for quick setup; network segregation is planned for the future.
- Cost Considerations: Focused on maintaining a consistent environment while minimizing costs.
- Postgres: Selected for its vector search capabilities and full-text search support, both critical for generating accurate SQL query results.
- Initial configuration placed the database in a public subnet for simplicity.
- Amazon Bedrock Model (Claude 3.5 Sonnet): Chosen for fast response times and strong code generation capabilities, essential for generating SQL queries.
- LangChain Usage: Two chains interact with Bedrock:
  - Convert user queries into SQL queries.
  - Convert SQL results into natural language responses.
- Prompt Engineering: The prompts specify rules that the generated SQL queries and natural language responses must follow to keep the output accurate.
- Query Execution: Generated queries are executed against Postgres, and the results are fed back to the model for conversion into natural language.
- Memory Store: Persist conversation memory in a database instead of keeping it local, so that query history is retained across sessions.
- Avoid Hard-Coded Configurations: Move settings such as prompting rules out of the code to make updates and maintenance easier.
- Security Improvements: Move the database into private subnets and adopt security best practices.
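
The two-chain flow described in the list above (question to SQL, query execution, results to natural language) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual implementation: `answer_question`, `fake_llm`, the prompt templates, and the `query_history` table are all hypothetical names; the model is replaced by a canned function standing in for the Bedrock-backed chains; and `sqlite3` stands in for Postgres so the example runs locally. It also shows one possible way to pass prompting rules in as a parameter and to persist query history in the database.

```python
import sqlite3
from typing import Callable

# Hypothetical prompt templates; the project's real prompting rules are not shown here.
SQL_PROMPT = (
    "Rules: {rules}\n"
    "Schema: {schema}\n"
    "Write one SQL query that answers: {question}"
)
ANSWER_PROMPT = (
    "Question: {question}\n"
    "SQL result rows: {rows}\n"
    "Answer in plain English."
)


def answer_question(
    question: str,
    conn: sqlite3.Connection,
    llm: Callable[[str], str],
    schema: str,
    rules: str,
) -> str:
    """Run the flow: question -> SQL -> rows -> natural-language answer."""
    # Step 1: ask the model to turn the question into SQL.
    sql = llm(SQL_PROMPT.format(rules=rules, schema=schema, question=question)).strip()
    # Assumed guard: only allow read-only statements to reach the database.
    if not sql.lower().startswith("select"):
        raise ValueError(f"Refusing to run non-SELECT statement: {sql!r}")
    # Step 2: execute the generated SQL.
    rows = conn.execute(sql).fetchall()
    # Step 3: ask the model to phrase the rows as a natural-language answer.
    answer = llm(ANSWER_PROMPT.format(question=question, rows=rows)).strip()
    # Persist the exchange in the database so history outlives the session.
    conn.execute(
        "INSERT INTO query_history (question, sql_text, answer) VALUES (?, ?, ?)",
        (question, sql, answer),
    )
    conn.commit()
    return answer


def fake_llm(prompt: str) -> str:
    """Canned stand-in for the model, for local experimentation only."""
    if prompt.startswith("Rules:"):
        return "SELECT COUNT(*) FROM orders"
    return "There are 2 orders."


def demo() -> tuple[str, list]:
    """Build a throwaway in-memory database and run one question through the flow."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
    conn.execute(
        "CREATE TABLE query_history (question TEXT, sql_text TEXT, answer TEXT)"
    )
    conn.executemany("INSERT INTO orders (total) VALUES (?)", [(9.5,), (20.0,)])
    answer = answer_question(
        "How many orders are there?",
        conn,
        fake_llm,
        schema="orders(id, total)",
        rules="read-only SELECT statements",
    )
    history = conn.execute(
        "SELECT question, sql_text, answer FROM query_history"
    ).fetchall()
    return answer, history
```

Calling `demo()` runs one question end to end and returns the answer together with the stored history rows. In a real deployment the `llm` callable would wrap the two LangChain chains and `conn` would be a Postgres connection, whose driver may use a different parameter placeholder style than the `?` used here.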