Transformer Development Documentation

Harry Mellsop edited this page May 27, 2021 · 3 revisions

The transformer model forms the heart of CodeBae. Its architecture is based on GPT, and the model code itself was ported from an offering of CS224N.

Environment

Make sure you're using the virtual environment whenever you interact with the model. From the /backend directory,

source venv/bin/activate

(To create this virtual environment from the requirements file, follow the Backend Documentation instructions, which will also acquaint you with the main backend codebase.)

Once you are in the virtual environment, navigate to ./backend/models/transformer to interact with the model.

Obtaining the Data

cd models/transformer
python data/get_datasets.py --download --preprocess

This command will download and extract the relevant datasets.
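The two flags can be combined or run separately. As a rough sketch of how a script like this typically wires them up (only the `--download` and `--preprocess` flags come from the command above; every function and description here is an assumption, not the actual contents of `data/get_datasets.py`):

```python
# Hypothetical sketch of the flag handling in a get_datasets.py-style
# script. Only the flag names are taken from the documented command.
import argparse


def build_parser():
    parser = argparse.ArgumentParser(
        description="Fetch and prepare the transformer datasets")
    parser.add_argument("--download", action="store_true",
                        help="download and extract the raw dataset archives")
    parser.add_argument("--preprocess", action="store_true",
                        help="convert the raw files into training examples")
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    if args.download:
        ...  # fetch and extract archives
    if args.preprocess:
        ...  # tokenize / split into training examples
```

Because the flags are independent `store_true` switches, you can re-run preprocessing alone (`python data/get_datasets.py --preprocess`) without re-downloading.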

Training the Model

To train the model from scratch, ensure that you have the data downloaded, and then run

python run.py

The model frequently saves parameter checkpoints in ./ckpts/training_checkpoints. To resume training from one of these checkpoints, you can run

python run.py --resume-from [param file location]
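The save/resume cycle follows the usual checkpoint pattern. The real project stores PyTorch tensors in `.pt` files under `./ckpts/training_checkpoints`; in this minimal sketch a pickled dict stands in for the tensor state so the shape of the data is visible, and all file and key names are assumptions rather than the project's actual API:

```python
# Minimal sketch of a checkpoint/resume cycle (plain pickle standing in
# for torch.save/torch.load; key names are assumptions, not run.py's API).
import pickle
from pathlib import Path


def save_checkpoint(epoch, model_state, path):
    # e.g. path = Path("ckpts/training_checkpoints") / f"epoch_{epoch}.pt"
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, "wb") as f:
        pickle.dump({"epoch": epoch, "model_state": model_state}, f)


def resume_from(path):
    with open(path, "rb") as f:
        ckpt = pickle.load(f)
    # Training restarts at the epoch after the one that was saved.
    return ckpt["model_state"], ckpt["epoch"] + 1
```

Storing the epoch number alongside the parameters is what lets `--resume-from` pick up where training stopped instead of restarting the epoch counter.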

Final trained parameters are saved in ./ckpts/training_checkpoints alongside the per-epoch checkpoints. You can then deploy these parameters as follows:

Deploying Trained Parameters

Once you have the trained parameters, simply replace ./backend/models/transformer_params.pt with your newly trained parameter file. The TransformerModel subclass of BaseModel dynamically reconstructs the model architecture from this file, so deployment should be agnostic to the specific architecture you trained.
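Reconstructing the architecture from the file implies the parameter file bundles its hyperparameters next to the weights. One common way to get that behavior can be sketched as follows (in PyTorch this would be `torch.save`/`torch.load`; a pickled dict and a toy model class stand in here, and every name is an assumption about `transformer_params.pt`, not its verified layout):

```python
# Sketch of an architecture-agnostic parameter file: the config needed
# to rebuild the model travels with the weights. Names are hypothetical.
import pickle


class TinyModel:
    """Stand-in for the transformer; only the config round-trip matters."""

    def __init__(self, n_layers, d_model):
        self.n_layers = n_layers
        self.d_model = d_model


def save_params(path, config, state_dict):
    # Bundle hyperparameters next to the weights in one file.
    with open(path, "wb") as f:
        pickle.dump({"config": config, "state_dict": state_dict}, f)


def load_model(path):
    with open(path, "rb") as f:
        blob = pickle.load(f)
    # Rebuild the architecture from the stored config, then hand the
    # weights back to the caller to load into it.
    model = TinyModel(**blob["config"])
    return model, blob["state_dict"]
```

Because the loader reads the config out of the file itself, swapping in a parameter file trained with a different layer count or width requires no code change on the deployment side.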