PyTorch implementation of Implicit Quantile Networks (IQN) for Distributional Reinforcement Learning, with additional extensions such as PER, noisy layers and N-step bootstrapping that combine into a new Rainbow-DQN-style agent. The implementation can also run and train on several environments in parallel!
- Baseline IQN Notebook
- Script version with all extensions: IQN

The IQN baseline in this repository is already a Double IQN version with target networks! The available extensions are listed below (a minimal sketch of the IQN loss follows the list):
- Dueling IQN
- Noisy layer
- N-step bootstrapping
- Munchausen RL
- Parallel environments for faster training (wall-clock time). For CartPole-v0, three workers reduced the training time to one third!
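As a quick orientation, here is a minimal sketch of the quantile Huber loss at the heart of IQN. It is not a copy of this repository's training code; the tensor shapes, variable names and the `kappa` threshold of 1.0 are illustrative assumptions.

```python
import torch

def quantile_huber_loss(current_quantiles, target_quantiles, taus, kappa=1.0):
    """Quantile Huber loss used by IQN (illustrative sketch, not this repo's exact code).

    current_quantiles: (batch, N)  quantile estimates for the taken actions
    target_quantiles:  (batch, N') detached target quantiles r + gamma * Z(s', a*)
    taus:              (batch, N)  quantile fractions sampled for current_quantiles
    """
    # Pairwise TD errors between every target and every current quantile: (batch, N', N)
    td_errors = target_quantiles.unsqueeze(-1) - current_quantiles.unsqueeze(1)

    # Element-wise Huber loss with threshold kappa
    huber = torch.where(td_errors.abs() <= kappa,
                        0.5 * td_errors.pow(2),
                        kappa * (td_errors.abs() - 0.5 * kappa))

    # Asymmetric quantile weighting |tau - 1{td_error < 0}|
    weight = (taus.unsqueeze(1) - (td_errors.detach() < 0).float()).abs()

    # Sum over current quantiles, average over target quantiles and over the batch
    return (weight * huber / kappa).sum(dim=-1).mean(dim=1).mean()
```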
With the script version it is possible to train on simple environments like CartPole-v0 and LunarLander-v2 or on Atari games with image inputs!
To run the script version:
python run.py -info iqn_run1
To run the script version on the Atari game Pong:
python run.py -env PongNoFrameskip-v4 -info iqn_pong1
To see the options:
python run.py -h
-agent, choices=["iqn","iqn+per","noisy_iqn","noisy_iqn+per","dueling","dueling+per", "noisy_dueling","noisy_dueling+per"], Specify which type of IQN agent you want to train, default is the plain IQN baseline!
-env, Name of the Environment, default = BreakoutNoFrameskip-v4
-frames, Number of frames to train, default = 10 million
-eval_every, Evaluate every x frames, default = 250000
-eval_runs, Number of evaluation runs, default = 2
-seed, Random seed to replicate training runs, default = 1
-munchausen, choices=[0,1], Use Munchausen RL loss for training if set to 1 (True), default = 0
-bs, --batch_size, Batch size for updating the DQN, default = 8
-layer_size, Size of the hidden layer, default=512
-n_step, Number of steps for multi-step (N-step) IQN, default = 1
-N, Number of quantiles, default = 8
-m, --memory_size, Replay memory size, default = 1e5
-lr, Learning rate, default = 2.5e-4
-g, --gamma, Discount factor gamma, default = 0.99
-t, --tau, Soft update parameter tau, default = 1e-3
-eps_frames, Number of frames over which epsilon is linearly annealed, default = 1 million
-min_eps, Final epsilon greedy value, default = 0.01
-info, Name of the training run
-w, --worker, Number of parallel environments. The effective batch size increases proportionally to the number of workers; more than 4 workers is not recommended, default = 1
-save_model, choices=[0,1], Specify whether the trained network shall be saved, default is 0 (not saved)!
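The flags above can be combined; for example, to train a dueling IQN agent with PER and the Munchausen loss on LunarLander-v2 with 3-step bootstrapping and two parallel workers (the run name after -info is arbitrary):
python run.py -env LunarLander-v2 -agent dueling+per -munchausen 1 -n_step 3 -w 2 -info dueling_per_m_lunar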
To monitor the training progress with TensorBoard:
tensorboard --logdir=runs
Trained and tested on:
Python 3.6, PyTorch 1.4.0, NumPy 1.15.2, gym 0.10.11
IQN and Extensions (default hyperparameters):
Dueling IQN and Extensions (default hyperparameters):
IQN and M-IQN comparison (trained for only 500,000 frames, ~140 min).
Hyperparameters (an example run command follows the list):
- frames 500000
- eps_frames 75000
- min_eps 0.025
- eval_every 10000
- lr 1e-4
- t 5e-3
- m 15000
- N 32
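Using the flags documented above, a command along these lines should reproduce the M-IQN side of the comparison; the environment flag is omitted here and should be set as needed, and the run name is arbitrary:
python run.py -munchausen 1 -frames 500000 -eps_frames 75000 -min_eps 0.025 -eval_every 10000 -lr 1e-4 -t 5e-3 -m 15000 -N 32 -info miqn_comparison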
Performance after 10 million frames: score 258
- Comparison plot for n-step bootstrapping (n-step bootstrapping with n=3 seems to give a strong boost in learning compared to one-step bootstrapping; plots will follow, and a sketch of the n-step return is given after this list)
- Performance plot for Pong compared with Rainbow
- adding Munchausen RL ☑
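For reference, n-step bootstrapping replaces the one-step target r + gamma * Q(s', a*) with the discounted sum of the next n rewards plus a bootstrap term gamma^n * Q(s_{t+n}, a*). The sketch below shows one common way to collapse a window of stored transitions into a single n-step transition; it is an illustrative assumption about the mechanics, not this repository's exact implementation.

```python
def make_nstep_transition(window, gamma):
    """Collapse a window of (state, action, reward, next_state, done) tuples
    into one n-step transition (illustrative sketch)."""
    state, action = window[0][0], window[0][1]
    n_reward, next_state, done = 0.0, window[-1][3], window[-1][4]
    for k, (_, _, r, s_next, d) in enumerate(window):
        n_reward += (gamma ** k) * r
        if d:  # the episode ended inside the window: stop accumulating and mark done
            next_state, done = s_next, True
            break
    return state, action, n_reward, next_state, done
```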
I'm open to feedback, bug reports, improvements or anything else. Just leave me a message or contact me.
- Sebastian Dittert
Feel free to use this code for your own projects or research. For citation:
@misc{IQN and Extensions,
author = {Dittert, Sebastian},
title = {Implicit Quantile Networks (IQN) for Distributional Reinforcement Learning and Extensions},
year = {2020},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/BY571/IQN}},
}