This codebase is an implementation of the "Decision Transformer: Reinforcement Learning via Sequence Modeling" paper in tinygrad.
The flake.nix uses poetry2nix for building and running the code. Fetching the data must be done manually.
To run the code, first clone the repository:
git clone https://github.com/ethanthoma/decision-transformer.git
cd decision-transformer
Next, you will need to download the data. The code only takes a subset from one split (of five) per game from the Batch RL DQN Replay dataset. There is a simple script that will download all the splits for Pong:
./download_pong_data.sh
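For reference, each split of the DQN Replay dataset is stored as gzipped numpy arrays, one file per checkpoint and per field (observations, actions, rewards, terminals). The snippet below is a minimal sketch of reading one checkpoint with plain numpy; the directory layout is assumed to follow the public dataset's naming convention, and load_checkpoint is a hypothetical helper, not the repository's actual API.

import gzip
import numpy as np

def load_checkpoint(data_dir: str, ckpt_id: int) -> dict:
    # Hypothetical helper: reads one checkpoint of one split.
    # File names assume the DQN Replay layout, e.g. "$store$_observation_ckpt.7.gz".
    out = {}
    for field in ("observation", "action", "reward", "terminal"):
        path = f"{data_dir}/$store$_{field}_ckpt.{ckpt_id}.gz"
        with gzip.open(path, "rb") as f:
            out[field] = np.load(f)
    return out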
Finally, you can train the model via the train command:
nix run . -- train
And test your model via the test command:
nix run . -- test
Both the train and test commands accept many arguments. Use -h with either command to see all available flags.
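For example, to list the training flags:

nix run . -- train -h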
I did not do any timings or benchmarks for the code. However, I did originally use the data loader code from Google Research's Batch RL codebase. This used TensorFlow and dopamine-rl. Based on the logs after running it for 2 days, it would have taken a total of 7 days to generate the training set that was used in the original paper.
I remade the data loader using only numpy. It also uses multiple threads to load the 50 checkpoints. Data loading is now done in one day instead of 7.
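A minimal sketch of what the parallel loading could look like, using Python's standard ThreadPoolExecutor over a per-checkpoint reader like the one sketched earlier; the helper names and worker count are illustrative, not the repository's actual code.

from concurrent.futures import ThreadPoolExecutor

def load_split(data_dir: str, n_checkpoints: int = 50, workers: int = 8) -> list:
    # Illustrative only: read the 50 checkpoints of one split in parallel threads.
    # load_checkpoint is the hypothetical per-checkpoint reader sketched above.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda i: load_checkpoint(data_dir, i), range(n_checkpoints)))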
The model code is very similar to the PyTorch implementation from the original codebase. Everything is reimplemented in tinygrad.
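As an illustration of the tinygrad port, here is a minimal sketch of the token-interleaving step a Decision Transformer performs before its GPT backbone: returns-to-go, states, and actions are each embedded and interleaved into one sequence. All names and dimensions are hypothetical, and the plain linear state embedding stands in for whatever state encoder the actual code uses.

from tinygrad import Tensor, nn

class TokenEmbedding:
    # Illustrative sketch, not the repository's actual module.
    def __init__(self, state_dim: int, n_actions: int, embed_dim: int, max_timestep: int):
        self.embed_rtg = nn.Linear(1, embed_dim)            # returns-to-go
        self.embed_state = nn.Linear(state_dim, embed_dim)  # stand-in for the real state encoder
        self.embed_action = nn.Embedding(n_actions, embed_dim)
        self.embed_time = nn.Embedding(max_timestep, embed_dim)

    def __call__(self, rtg: Tensor, states: Tensor, actions: Tensor, timesteps: Tensor) -> Tensor:
        # rtg: (B, K, 1), states: (B, K, state_dim), actions/timesteps: (B, K) integer tensors
        t = self.embed_time(timesteps)
        r = self.embed_rtg(rtg) + t
        s = self.embed_state(states) + t
        a = self.embed_action(actions) + t
        B, K, D = s.shape
        # Interleave into (R_1, s_1, a_1, R_2, s_2, a_2, ...) with shape (B, 3K, D),
        # which is then fed to the GPT-style transformer.
        return r.unsqueeze(2).cat(s.unsqueeze(2), a.unsqueeze(2), dim=2).reshape(B, 3 * K, D)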
The model uses about 31 GB of RAM during training and 0.5 GB during testing.
TBD
TBD