Official Implementation of Deep Variance Weighting (DVW) [Experiments in Section 7.2.2]

This repository is the official implementation of Deep Variance Weighting for the MinAtar experiments in Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice.

Requirements

  • Step 1: Install dependencies
# make sure you are in Variance-Weighted-MDVI/Deep-Variance-Weighting-MinAtar
poetry install

# Install MinAtar from the submodule
poetry shell
git submodule update --init && cd MinAtar
pip install -e .
  • Step 2: Log in to wandb (used for visualization and plotting)
wandb login # only required the first time
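
You can optionally confirm that wandb is set up before launching any training runs. The snippet below is not part of the repository; the project and run names are arbitrary placeholders.

# Optional wandb sanity check (illustrative; project/run names are placeholders)
import wandb

run = wandb.init(project="minatar-test", name="wandb-smoke-test")
run.log({"sanity": 1.0})  # should show up in the wandb dashboard
run.finish()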

You can check that everything works by running:

# If you run into GPU issues, replace "--device cuda" with "--device cpu"

# Weighted M-DQN
poetry run python cleanrl/dqn_minatar.py --total-timesteps 50000 --env-id breakout --track --wandb-project-name minatar-test --exp-name Weight-Net-M-DQN --weight-type variance-net --device cuda
# M-DQN
poetry run python cleanrl/dqn_minatar.py --total-timesteps 50000 --env-id breakout --track --wandb-project-name minatar-test --exp-name M-DQN --weight-type none --device cuda

# Weighted DQN
poetry run python cleanrl/dqn_minatar.py --total-timesteps 50000 --env-id breakout --track --wandb-project-name minatar-test --exp-name Weight-Net-DQN --weight-type variance-net --kl-coef 0.0 --ent-coef 0.0 --device cuda
# DQN
poetry run python cleanrl/dqn_minatar.py --total-timesteps 50000 --env-id breakout --track --wandb-project-name minatar-test --exp-name DQN --weight-type none --kl-coef 0.0 --ent-coef 0.0 --device cuda
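
Conceptually, --weight-type variance-net re-weights each sample of the TD regression loss by an inverse-variance estimate from a separate weight network, while --weight-type none gives every transition unit weight. The sketch below only illustrates that idea; the function and network names (q_net, var_net, target) are assumptions and do not reflect the repository's actual code.

# Illustrative sketch of variance-weighted TD regression.
# Names (q_net, var_net, target) are assumptions, not the repository's code.
import torch
import torch.nn.functional as F

def variance_weighted_td_loss(q_net, var_net, obs, actions, target, eps=1e-3):
    q = q_net(obs).gather(1, actions.unsqueeze(1)).squeeze(1)               # Q(s, a)
    var = var_net(obs).gather(1, actions.unsqueeze(1)).squeeze(1).detach()  # estimated target variance
    weights = 1.0 / (var + eps)                                             # inverse-variance weights
    weights = weights / weights.mean()                                      # keep the loss scale comparable
    return (weights * F.mse_loss(q, target, reduction="none")).mean()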

Run MinAtar Experiments

Run bash run_minatar.bash
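
If you would rather drive a similar sweep from Python, the rough sketch below loops over environments and seeds and shells out to the training script. The environment list, seed count, project name, and the --seed flag are assumptions; check run_minatar.bash for the exact configurations used in the paper.

# Rough Python equivalent of a command sweep.
# Environment list, seed count, and the --seed flag are assumptions; see run_minatar.bash.
import itertools
import subprocess

envs = ["breakout", "asterix", "seaquest", "space_invaders", "freeway"]  # assumed MinAtar ids
seeds = range(3)                                                         # assumed number of seeds

for env_id, seed in itertools.product(envs, seeds):
    subprocess.run([
        "poetry", "run", "python", "cleanrl/dqn_minatar.py",
        "--env-id", env_id, "--seed", str(seed),
        "--track", "--wandb-project-name", "minatar",
        "--exp-name", "Weight-Net-M-DQN", "--weight-type", "variance-net",
    ], check=True)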

Plot results

Run all the cells in minatar-results/result-plotter.ipynb. The figures will be saved in the minatar-results directory.
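
The notebook reads the runs that were logged to wandb. If you want to pull the same data programmatically, a minimal sketch using the public wandb API is shown below; the entity/project path and the metric key are placeholders that depend on your wandb account and on what the training script logs.

# Minimal sketch of fetching logged metrics from wandb.
# The entity/project path and metric key are placeholders.
import wandb

api = wandb.Api()
for run in api.runs("your-entity/minatar-test"):             # adjust to your wandb entity/project
    history = run.history(keys=["charts/episodic_return"])   # metric key depends on the training script
    print(run.name, history.tail(5))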

(Optional) Classic Control

If you are interested in other environments, try the following commands for classic control:

# If you run into GPU issues, replace "--device cuda" with "--device cpu"

# Weighted M-DQN
poetry run python cleanrl/dqn.py --total-timesteps 50000 --env-id CartPole-v1 --track --wandb-project-name classic-control-test --exp-name Weight-Net-M-DQN --weight-type variance-net --device cuda
# M-DQN
poetry run python cleanrl/dqn.py --total-timesteps 50000 --env-id CartPole-v1 --track --wandb-project-name classic-control-test --exp-name M-DQN --weight-type none --device cuda 

# Weighted DQN
poetry run python cleanrl/dqn.py --total-timesteps 50000 --env-id CartPole-v1 --track --wandb-project-name classic-control-test --exp-name Weight-Net-DQN --weight-type variance-net --kl-coef 0.0 --ent-coef 0.0 --device cuda
# DQN
poetry run python cleanrl/dqn.py --total-timesteps 50000 --env-id CartPole-v1 --track --wandb-project-name classic-control-test --exp-name DQN --weight-type none --kl-coef 0.0 --ent-coef 0.0 --device cuda 
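
For reference, --kl-coef and --ent-coef control the strength of the KL and entropy regularization in the Munchausen-style (M-DQN) target; setting both to 0 recovers the plain (optionally weighted) DQN target. The sketch below writes this target in the standard Munchausen-DQN form; the variable names and the exact mapping from the two flags to the temperature and Munchausen coefficient are assumptions, so see the paper for the precise scheme.

# Illustrative KL/entropy-regularized (Munchausen-style) target.
# Names and the mapping from --kl-coef / --ent-coef to (tau, alpha) are assumptions.
import torch
import torch.nn.functional as F

def regularized_target(q_target, obs, actions, rewards, next_obs, dones,
                       kl_coef, ent_coef, gamma=0.99):
    with torch.no_grad():
        tau = kl_coef + ent_coef                      # softmax temperature
        if tau == 0.0:                                # no regularization: plain DQN target
            next_v = q_target(next_obs).max(dim=1).values
            return rewards + gamma * (1.0 - dones) * next_v
        alpha = kl_coef / tau                         # weight of the Munchausen log-policy bonus
        log_pi = F.log_softmax(q_target(obs) / tau, dim=1)
        next_q = q_target(next_obs)
        next_log_pi = F.log_softmax(next_q / tau, dim=1)
        soft_v = (next_log_pi.exp() * (next_q - tau * next_log_pi)).sum(dim=1)
        bonus = alpha * tau * log_pi.gather(1, actions.unsqueeze(1)).squeeze(1)
        return rewards + bonus + gamma * (1.0 - dones) * soft_v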
