This repository provides an end-to-end solution for deploying a FastAPI application that generates sentence embeddings using SentenceTransformers. The infrastructure is defined using AWS CDK and includes:
- AWS ECS Fargate: For running the containerized FastAPI application.
- AWS CodePipeline & CodeBuild: For continuous integration and deployment.
- AWS ECR: For storing Docker images.
- AWS VPC and Networking Components: For secure and scalable networking.
- FastAPI Application: A simple API that accepts text inputs and returns their embeddings using a pre-trained SentenceTransformer model.
- Docker Container: The application is containerized using Docker.
- AWS ECR (Elastic Container Registry): Stores the Docker images.
- AWS ECS Fargate: Runs the Docker container in a serverless compute environment.
- AWS CodePipeline & CodeBuild: Automates the build, test, and deployment of the application.
- AWS CDK (Cloud Development Kit): Infrastructure is defined as code using TypeScript.
- AWS Account: An active AWS account with permissions to create the resources.
- AWS CLI: Installed and configured with your AWS credentials.
- Node.js: Version 14 or later.
- AWS CDK: Installed globally (npm install -g aws-cdk).
- Docker: Installed and running.
- GitHub Account: For hosting the source code and integrating with CodePipeline.
- GitHub Personal Access Token: Required for CodePipeline to access the repository.
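As a quick sanity check before deploying, the command-line prerequisites above can be verified with a short script. This is a minimal sketch: the command names (aws, node, npm, cdk, docker, git) are assumed from the list above, and the helper name find_missing_tools is hypothetical. It only checks that each tool is on the PATH, not its version or whether the Docker daemon is running.

```python
import shutil

# Command names assumed from the prerequisites list; adjust if yours differ.
REQUIRED_TOOLS = ("aws", "node", "npm", "cdk", "docker", "git")


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = find_missing_tools()
if missing:
    print("Missing tools:", ", ".join(missing))
else:
    print("All required tools were found on PATH.")
```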
git clone git@github.com:OrderAndCh4oS/sentence-transformer-similarities-code-pipeline-fargate-ecs.git
cd sentence-transformer-similarities-code-pipeline-fargate-ecs
Install the necessary packages from the CDK project directory (skip the cd if you are already there):
cd sentence-transformer-similarities-code-pipeline-fargate-ecs
npm install
- Generate a GitHub Personal Access Token with repo and admin:repo_hook permissions.
- Store the token in AWS Secrets Manager:
aws secretsmanager create-secret \
  --name github/personal_access_token \
  --secret-string "<your-personal-access-token>"
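To confirm the secret was stored under the expected name, it can be read back with boto3. The following is a minimal sketch, assuming boto3 is installed and AWS credentials are configured; the helper name read_github_token is hypothetical and not part of this repository. The client is passed in as an argument so the function can be exercised without calling AWS.

```python
def read_github_token(secrets_client, secret_id="github/personal_access_token"):
    """Fetch the stored token using a boto3 Secrets Manager client."""
    response = secrets_client.get_secret_value(SecretId=secret_id)
    token = response.get("SecretString")
    if not token:
        raise ValueError(f"Secret {secret_id!r} has no SecretString value")
    return token


# Example usage (requires boto3 and valid AWS credentials):
#   import boto3
#   token = read_github_token(boto3.client("secretsmanager"))
#   print(f"Token found, {len(token)} characters long")
```

Avoid printing the token itself; checking its length is enough to confirm it exists.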
Update the cdk.json file with your GitHub username and repository name if necessary.
Deploy the stack:
cdk deploy --parameters githubUserName=<your-github-username>
- lib/ecs-code-deploy-example-stack.ts: CDK stack defining AWS resources.
- src/app/main.py: FastAPI application code.
- src/app/save_sentence_transformer_model.py: Script to download and save the model.
- src/Dockerfile: Dockerfile for building the application image.
By default, the msmarco-MiniLM-L12-cos-v5 model is used. To change the model:
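- Update save_sentence_transformer_model.py:
  from sentence_transformers import SentenceTransformer
  model_name = "your-model-name"
  model = SentenceTransformer(f"sentence-transformers/{model_name}")  # download from the Hugging Face Hub
  model.save(f"/src/app/{model_name}")  # save into the image at build time
- Update main.py to load the new model:
  from sentence_transformers import SentenceTransformer
  model_name = "your-model-name"
  model = SentenceTransformer(f"/src/app/{model_name}")  # load the saved copy from disk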
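Because both files must agree on the model name, one option is to derive the Hub identifier and the local path from a single setting. The sketch below is an assumption, not how the repository is organised: the MODEL_NAME environment variable and the resolve_model helper are hypothetical, while the sentence-transformers/ prefix and the /src/app path come from the snippets above.

```python
import os

# Hypothetical shared settings; the repository hard-codes these values instead.
DEFAULT_MODEL_NAME = "msmarco-MiniLM-L12-cos-v5"
MODEL_ROOT = "/src/app"


def resolve_model(env=os.environ):
    """Return (hub_id, local_path) for the configured model name."""
    name = env.get("MODEL_NAME", DEFAULT_MODEL_NAME)
    return f"sentence-transformers/{name}", f"{MODEL_ROOT}/{name}"


hub_id, local_path = resolve_model()
print(hub_id, local_path)
```

With this approach the save script would download hub_id and save to local_path, and main.py would load from local_path. The variable would need to be set both when the image is built and when the container runs.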
curl -X POST http://<load-balancer-dns-name>/embeddings/create \
-H "Content-Type: application/json" \
-d '{"texts": ["Hello, world!", "Another sentence."]}'
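The same request can be sent from Python using only the standard library. This is a minimal sketch: the function names are hypothetical, the endpoint path and payload mirror the curl command above, and the response is returned as parsed JSON without assuming its exact shape.

```python
import json
import urllib.request


def build_embeddings_request(base_url, texts):
    """Build the POST request for /embeddings/create without sending it."""
    payload = json.dumps({"texts": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/embeddings/create",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def create_embeddings(base_url, texts, timeout=30):
    """Send the request and return the decoded JSON body as-is."""
    request = build_embeddings_request(base_url, texts)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example usage (replace the placeholder with your load balancer DNS name):
#   result = create_embeddings("http://<load-balancer-dns-name>",
#                              ["Hello, world!", "Another sentence."])
#   print(result)
```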
To avoid incurring charges, delete the AWS resources when they're no longer needed:
cdk destroy
Also, remove the ECR repository and any images:
aws ecr delete-repository --repository-name similarity-embeddings-repository --force