- Sequence-to-sequence (seq2seq) models for language translation with PyTorch
- Bidirectional encoder-decoder LSTM
- Notebook at `src/en-fr_rnn.ipynb`
- English-to-French translation using the Anki dataset
- Model deployment on AWS SageMaker for inference using BentoML
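The bidirectional encoder half of the model can be sketched roughly as follows; the hyperparameters (vocabulary size, hidden size) are illustrative assumptions, not values taken from the notebook:

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Bidirectional LSTM encoder: embeds source tokens and returns
    per-token encodings plus the final hidden/cell states."""

    def __init__(self, vocab_size: int, hidden_size: int = 512):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, hidden_size)
        self.lstm = nn.LSTM(hidden_size, hidden_size,
                            batch_first=True, bidirectional=True)

    def forward(self, src):
        embedded = self.embedding(src)          # (batch, seq, hidden)
        outputs, (h, c) = self.lstm(embedded)   # outputs: (batch, seq, 2 * hidden)
        return outputs, (h, c)

# Example: encode a batch of 2 sentences of length 5
enc = Encoder(vocab_size=1000, hidden_size=64)
outputs, (h, c) = enc(torch.randint(0, 1000, (2, 5)))
```

Because the encoder is bidirectional, the output feature dimension is twice the hidden size, and the hidden/cell states have a leading dimension of 2 (one per direction); a decoder consuming these states needs to account for that.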
Steps to deploy the trained model on AWS SageMaker for inference. Once complete, the model will be accessible for inference at an API endpoint on AWS.
- Install Terraform
- Install Docker
- Install the dependencies in `requirements.txt`:

```shell
pip install -r requirements.txt
```
- Run the `src/en-fr_rnn.ipynb` notebook to train the model and save its weights at `src/weights/en-fr_rnn_lstm_512.pt`. Modify `src/create_model.py` with any changes to the final model's architecture or the paths to its weights, if necessary. Note: pretrained weights are currently not included in this repo due to file size limits imposed by GitHub.
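Saving and reloading the weights follows the standard PyTorch pattern; `model` below is a stand-in for the notebook's final seq2seq model, and the filename mirrors the one above (the real path is under `src/weights/`):

```python
import torch
import torch.nn as nn

# Placeholder for the trained seq2seq model from the notebook
model = nn.LSTM(8, 8)

# Save only the learned parameters, not the whole pickled object
torch.save(model.state_dict(), "en-fr_rnn_lstm_512.pt")

# Later (e.g. in create_model.py): rebuild the same architecture, then load
restored = nn.LSTM(8, 8)
restored.load_state_dict(torch.load("en-fr_rnn_lstm_512.pt"))
```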
- Create the BentoML model:

```shell
cd src
python create_model.py
```
- Build the bento:

```shell
bentoml build
```
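`bentoml build` reads a `bentofile.yaml` in the working directory. A minimal sketch is shown below; the service import path and included files are assumptions, so check the repo's actual file:

```yaml
service: "service:svc"   # import path of the bentoml.Service object
include:
  - "*.py"
python:
  packages:
    - torch
```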
- Install the aws-sagemaker bentoctl operator:

```shell
bentoctl operator install aws-sagemaker
```
- Initialize a bentoctl deployment. You may want to delete `src/deployment_config.yaml` and `src/main.tf` beforehand, so that they can be regenerated without conflicts. Note: if other operators are installed, running `bentoctl init` may not work as expected, particularly on Windows hosts. Remove all existing bentoctl operators by deleting the bentoctl folder at `~/bentoctl`, then install the operator you'd like to work with.

```shell
bentoctl init
```
- Run `bentoctl build` with the deployment config and the built bento. You can view existing bentos with the command `bentoml list`.

```shell
bentoctl build -f deployment_config.yaml -b <YOUR_BENTO_NAME:TAG_HERE>
```
- Initialize Terraform and apply the Terraform config/plan. Use `bentoctl.tfvars`, which should have been created when running `bentoctl init`, as the var file.

```shell
terraform init
terraform apply --var-file=bentoctl.tfvars
```
Once done, destroy all resources created (including the AWS ECR repository) with:

```shell
bentoctl destroy
```
- Implement attention with the RNNs, or even a transformer
- Try using a larger dataset, such as WMT 2014
- Test and implement different methods to reduce text degeneration
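One common family of fixes for text degeneration is replacing greedy decoding with truncated sampling. A minimal pure-Python sketch of top-k sampling with temperature (the function name and values are illustrative, not from this repo):

```python
import math
import random

def top_k_sample(logits, k=3, temperature=0.8, rng=random):
    """Sample a token index from the k highest-scoring logits,
    after sharpening/flattening the distribution with temperature."""
    # Indices of the k largest logits
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax over the truncated, temperature-scaled logits
    scaled = [logits[i] / temperature for i in top]
    m = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scaled]
    return rng.choices(top, weights=weights, k=1)[0]

logits = [0.1, 2.5, -1.0, 1.7, 0.3]
choice = top_k_sample(logits, k=2)  # only indices 1 and 3 are candidates
```

Lowering the temperature sharpens the distribution toward the top token; `k=1` reduces to greedy decoding.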
- PyTorch tutorial by Sean Robertson introducing seq2seq networks: https://pytorch.org/tutorials/intermediate/seq2seq_translation_tutorial.html
- YouTube video by @mildlyoverfitted explaining the BentoML deployment process: https://youtu.be/Zci_D4az9FU