Trust-Region-Policy-Optimization

My attempt at a TRPO implementation in PyTorch. :)

The implementation is inspired by the assignments from UC Berkeley's Deep RL Bootcamp, the TRPO implementations by ikostrikov and mjacar, and the original implementation by John Schulman.

To run:

python main.py

All parameters are defined in trpo_agent.py.
