My personal implementations of several deep reinforcement learning algorithms. Note that they may deviate slightly from the original papers, since I wrote them for my own learning. That said, if you see something architecturally incorrect in one of them, please let me know.
- DQN
- PPO
- VPG
- SAC
- A2C
- TD3
- DDPG
- TRPO
- Rainbow-DQN (can't wait for this one, tbh)