This is the official code for VaGraM, published at ICLR 2022.
The codebase builds on MBRL-Lib.
To run the experiments presented in the paper, install the required packages listed in requirements.txt and use the vagram/mbrl/examples/main.py script provided by mbrl-lib.
The exact settings for the Hopper experiments can be found in vagram/scripts:
Distraction (the second command-line parameter sets the number of distracting dimensions):
python3 -m mbrl.examples.main \
seed=$1 \
algorithm=mbpo \
overrides=mbpo_hopper_distraction \
overrides.num_steps=500000 \
overrides.model_batch_size=1024 \
overrides.distraction_dimensions=$2
Reduced model size (num_layers and hid_size set the model size):
python3 -m mbrl.examples.main \
seed=$RANDOM \
algorithm=mbpo \
overrides=mbpo_hopper \
dynamics_model.model.num_layers=3 \
dynamics_model.model.hid_size=64 \
overrides.model_batch_size=1024
To use MSE/MLE instead of VaGraM, run:
python3 -m mbrl.examples.main \
seed=$1 \
algorithm=mbpo \
overrides=mbpo_hopper_distraction \
overrides.num_steps=500000 \
overrides.model_batch_size=256 \
dynamics_model=gaussian_mlp_ensemble \
overrides.distraction_dimensions=$2
The core implementation of the VaGraM algorithm can be found in vagram/mbrl/models/vaml_mlp.py. The code offers three variants: one for IterVAML, one for the unbounded VaGraM objective, and one for the bounded VaGraM objective used in the paper. The default configuration used in all experiments can be found in vagram/mbrl/examples/conf/dynamics_model/vaml_ensemble.yaml.
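To give an idea of the objective, below is a minimal, simplified PyTorch sketch of the value-gradient weighting: a per-dimension squared model error scaled by the squared gradient of the value function at the observed next state. The function name and the exact form of the bound are illustrative assumptions only; the actual bounded objective used in the paper is the one implemented in vagram/mbrl/models/vaml_mlp.py.

```python
import torch

def value_gradient_weighted_loss(pred_next_state, true_next_state, value_fn):
    """Illustrative sketch of a value-gradient weighted model loss.

    pred_next_state, true_next_state: tensors of shape (batch, state_dim).
    value_fn: callable mapping a batch of states to scalar values V(s).
    """
    # Gradient of the value function, evaluated at the observed next states.
    states = true_next_state.detach().requires_grad_(True)
    (value_grad,) = torch.autograd.grad(value_fn(states).sum(), states)
    value_grad = value_grad.detach()

    # Per-dimension squared model error, scaled by the squared value gradient.
    sq_err = (pred_next_state - true_next_state) ** 2
    weighted = value_grad ** 2 * sq_err

    # Bounding idea (assumption for illustration): never let the weighted
    # error exceed the plain squared error in any dimension.
    bounded = torch.minimum(weighted, sq_err)
    return bounded.sum(dim=-1).mean()
```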
In addition to the implementation details described in the paper, we introduced a cache for the computed value-function gradients. This does not change the optimization in any way; it simply stores the gradients of the sampled states and reuses them until the value function is updated, which speeds up training.
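As a rough sketch of how such a cache might look (the class name and interface are hypothetical; the actual cache is part of vagram/mbrl/models/vaml_mlp.py):

```python
import torch

class ValueGradientCache:
    """Hypothetical sketch of a value-gradient cache.

    Gradients of V(s) with respect to the sampled states are computed once
    and reused until the value function changes; invalidate() must be called
    after every value-function update.
    """

    def __init__(self, value_fn):
        self.value_fn = value_fn
        self._cache = {}

    def invalidate(self):
        # Drop all stored gradients after the value function is updated.
        self._cache.clear()

    def gradient(self, batch_id, states):
        if batch_id not in self._cache:
            states = states.detach().requires_grad_(True)
            (grad,) = torch.autograd.grad(self.value_fn(states).sum(), states)
            self._cache[batch_id] = grad.detach()
        return self._cache[batch_id]
```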
If you use this project in your research, please cite:
@inproceedings{voelcker2022vagram,
title={{Value Gradient weighted Model-Based Reinforcement Learning}},
author={Claas A Voelcker and Victor Liao and Animesh Garg and Amir-massoud Farahmand},
booktitle={International Conference on Learning Representations (ICLR)},
year={2022},
url={https://openreview.net/forum?id=4-D6CZkRXxI}
}
VaGraM is released under the MIT license. See LICENSE for additional details.