Transformer Development Documentation
The transformer model forms the heart of CodeBae. Its architecture is based on GPT, and the model code itself was ported from an offering of CS224N.
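To make the GPT-style design concrete, here is a minimal sketch of one decoder block in PyTorch. The class name, dimensions, and pre-norm layout are illustrative assumptions, not the actual CodeBae architecture; see the code under models/transformer for the real implementation.

```python
import torch
import torch.nn as nn

class DecoderBlock(nn.Module):
    """Illustrative GPT-style decoder block: masked self-attention + MLP.
    Names and hyperparameters here are hypothetical, not CodeBae's."""
    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ln2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )

    def forward(self, x):
        # Causal mask: True entries are blocked, so position i only
        # attends to positions <= i.
        T = x.size(1)
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)
        h = self.ln1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask)
        x = x + attn_out
        x = x + self.mlp(self.ln2(x))
        return x

x = torch.randn(2, 10, 64)   # (batch, sequence, d_model)
y = DecoderBlock()(x)        # output keeps the input shape
```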
Make sure you're using the virtual environment whenever you interact with the model. From the /backend directory,
source venv/bin/activate
(To create this virtual environment from requirements, follow the instructions in the Backend Documentation and familiarize yourself with the main backend codebase.)
Once you are in the virtual environment, navigate to ./backend/models/transformer to interact with the model.
cd models/transformer
python data/get_datasets.py --download --preprocess
This command will download, extract, and preprocess the relevant datasets.
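The two flags above could be wired up with argparse roughly as follows. This is a hypothetical sketch of the script's interface, not its actual contents; consult data/get_datasets.py for the real option set.

```python
import argparse

# Hypothetical flags mirroring the command above; the real
# get_datasets.py may accept additional options.
parser = argparse.ArgumentParser(description="Fetch and prepare training data")
parser.add_argument("--download", action="store_true",
                    help="download and extract the raw datasets")
parser.add_argument("--preprocess", action="store_true",
                    help="preprocess the downloaded data")

# Simulate `python data/get_datasets.py --download --preprocess`
args = parser.parse_args(["--download", "--preprocess"])
```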
To train the model from scratch, ensure that you have the data downloaded, and then run
python run.py
The model periodically saves parameter checkpoints in ./ckpts/training_checkpoints. To resume training from one of these checkpoints, run
python run.py --resume-from [param file location]
Final trained parameters will be saved in ./ckpts/training_checkpoints alongside the per-epoch checkpoints. You can then deploy these parameters as follows:
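For reference, the save/resume cycle typically looks like the PyTorch sketch below. The checkpoint keys and the stand-in model are assumptions for illustration; run.py's actual checkpoint layout may differ.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Hypothetical stand-in model and optimizer for illustration only.
model = nn.Linear(8, 8)
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

ckpt_path = os.path.join(tempfile.gettempdir(), "demo_checkpoint.pt")

# Save a checkpoint at the end of an epoch (assumed key names).
torch.save(
    {
        "epoch": 5,
        "model_state": model.state_dict(),
        "optim_state": optimizer.state_dict(),
    },
    ckpt_path,
)

# Resume: restore weights, optimizer state, and the epoch counter.
ckpt = torch.load(ckpt_path)
model.load_state_dict(ckpt["model_state"])
optimizer.load_state_dict(ckpt["optim_state"])
start_epoch = ckpt["epoch"] + 1  # continue from the next epoch
```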
Once you have the trained parameters, simply replace ./backend/models/transformer_params.pt with your newly trained parameter file. The TransformerModel subclass of BaseModel dynamically reads the model architecture out of this file, so deployment should be agnostic to the specific architecture you trained.
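One common way to make a parameter file architecture-agnostic is to bundle a config dict alongside the weights, so the loader can rebuild the model before loading the state dict. The sketch below shows that pattern; the file layout, build_model helper, and key names are assumptions, not TransformerModel's actual internals.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Hypothetical builder: reconstructs a model from a saved config.
def build_model(config):
    return nn.Linear(config["d_in"], config["d_out"])

# Save: bundle the architecture config with the weights in one .pt file.
config = {"d_in": 16, "d_out": 4}
model = build_model(config)
path = os.path.join(tempfile.gettempdir(), "transformer_params_demo.pt")
torch.save({"config": config, "state_dict": model.state_dict()}, path)

# Load: read the config first, rebuild the model, then restore weights.
bundle = torch.load(path)
restored = build_model(bundle["config"])
restored.load_state_dict(bundle["state_dict"])
```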