This tutorial assumes a completely fresh installation of Ubuntu. If the
dependencies are already installed, run only the "Activate Virtual Environment",
"Install the Given Requirements", and "Run the Relevant Python File" steps.
This repo is inspired by Moritz Schneider's implementation of MAML TRPO,
schneimo/maml-rl-tf2 (TensorFlow), and by the rlworkgroup's implementation of
MAML PPO, rlworkgroup/garage (PyTorch / TensorFlow).
sudo apt update
sudo apt install git
From your GitHub account, go to Settings → Developer Settings → Personal Access
Tokens → Tokens (Classic) → Generate New Token → Generate New Token (Classic).
Add a relevant "Note", select the "Repo" scope, fill out the remainder of the
form, and click Generate Token. Copy the generated token and keep it private; it
will look something like
ghp_randomly_generated_personal_access_token
git clone https://ghp_randomly_generated_personal_access_token@github.com/ChinemeremChigbo/maml-ppo.git
cd maml-ppo/
sudo apt install curl build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev libncursesw5-dev xz-utils tk-dev libxml2-dev libxmlsec1-dev libffi-dev liblzma-dev
curl https://pyenv.run | bash
printf "%s\n" '' 'export PATH="$HOME/.pyenv/bin:$PATH"' 'eval "$(pyenv init -)"' 'eval "$(pyenv virtualenv-init -)"' >> ~/.bashrc
source ~/.bashrc
pyenv install 3.7.16
sudo apt install ffmpeg patchelf unzip libosmesa6-dev libgl1-mesa-glx libglfw3
wget https://www.roboti.us/download/mjpro150_linux.zip
wget https://www.roboti.us/file/mjkey.txt
unzip mjpro150_linux.zip
rm mjpro150_linux.zip
mkdir -p "$HOME/.mujoco"
mv mjpro150 $HOME/.mujoco
mv mjkey.txt $HOME/.mujoco
printf "%s\n" '' 'export LD_LIBRARY_PATH=${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$HOME/.mujoco/mjpro150/bin' 'export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python' >> ~/.bashrc
source ~/.bashrc
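Before moving on, it can help to confirm that the MuJoCo files ended up where the commands above put them. The following is a minimal sketch, assuming the layout used in this tutorial (`~/.mujoco/mjpro150/bin` and `~/.mujoco/mjkey.txt`); the function name `missing_mujoco_paths` is hypothetical and not part of this repo.

```python
from pathlib import Path


def missing_mujoco_paths(home):
    """Return the expected MuJoCo paths that do not exist under `home`."""
    root = Path(home) / ".mujoco"
    # Paths assumed from the unzip/mv steps in this tutorial.
    expected = [root / "mjpro150" / "bin", root / "mjkey.txt"]
    return [str(p) for p in expected if not p.exists()]


if __name__ == "__main__":
    missing = missing_mujoco_paths(Path.home())
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("MuJoCo files found")
```

An empty result means both the binary directory and the key file are in place; anything listed points to the step that needs to be repeated.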
pyenv local 3.7.16
python3 -m venv env3.7
source env3.7/bin/activate
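Once the environment is activated, a quick check can confirm that `python3` really is the pyenv-provided 3.7 interpreter running inside the virtual environment. This is a sketch rather than part of the repo: `interpreter_ok` is a hypothetical helper, and it relies only on `sys.version_info`, `sys.prefix`, and `sys.base_prefix` from the standard library.

```python
import sys


def interpreter_ok(version_info=sys.version_info,
                   prefix=sys.prefix,
                   base_prefix=sys.base_prefix):
    """Return True when running Python 3.7 from inside a virtual environment."""
    is_37 = tuple(version_info[:2]) == (3, 7)
    # Inside a venv, sys.prefix points at the environment and differs
    # from sys.base_prefix, which points at the base installation.
    in_venv = prefix != base_prefix
    return is_37 and in_venv


if __name__ == "__main__":
    print("interpreter ok" if interpreter_ok() else "wrong interpreter or venv not active")
```

If this reports a problem, re-run `pyenv local 3.7.16` and `source env3.7/bin/activate` from the repository directory before installing the requirements.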
pip install wheel==0.40.0
python3 -m pip install -r requirements.txt
python3 main_trpo.py --env-name HalfCheetahDir-v1 --num-workers 20 --fast-lr 0.1 --max-kl 0.01 --fast-batch-size 5 --meta-batch-size 10 --num-layers 2 --hidden-size 100 --num-batches 1 --gamma 0.99 --tau 1.0 --cg-damping 1e-5 --ls-max-steps 10 --save-iters 1
python3 experiments.py
python3 main_maml_ppo.py --epochs=1 --episodes_per_task=1
python3 main_cav_ppo.py --epochs=1 --episodes_per_task=1
python3 main_cav_maml_ppo.py --epochs=1
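The three PPO commands above each run a single short epoch, so they can serve as a smoke test of the installation. A possible way to run them in sequence and stop at the first failure is sketched below; the script names and flags are copied from the commands above, while `SMOKE_COMMANDS` and `run_smoke_tests` are hypothetical names introduced for illustration.

```python
import subprocess
import sys

# Script names and flags copied from the commands in this tutorial.
SMOKE_COMMANDS = [
    ["main_maml_ppo.py", "--epochs=1", "--episodes_per_task=1"],
    ["main_cav_ppo.py", "--epochs=1", "--episodes_per_task=1"],
    ["main_cav_maml_ppo.py", "--epochs=1"],
]


def run_smoke_tests(commands=SMOKE_COMMANDS, run=subprocess.run):
    """Run each script with the current interpreter.

    Returns the name of the first script that exits with a non-zero
    status, or None if every script succeeds.
    """
    for args in commands:
        result = run([sys.executable, *args])
        if result.returncode != 0:
            return args[0]
    return None
```

Calling `run_smoke_tests()` from the repository root with the virtual environment active returns `None` when all three scripts exit cleanly. The `run` parameter exists only so the function can be exercised without launching the real scripts.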
Replace the 2 in the file name below with the number of the CAV test you need.
python3 test_2CAV_BFoptimal_Kaige.py
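To see which CAV tests are available before picking a number, the test scripts can be listed by file name. This sketch assumes the other scripts follow the same `test_<N>CAV_BFoptimal_Kaige.py` pattern as the one above, with `<N>` an integer; the helper `available_cav_tests` is hypothetical.

```python
import re
from pathlib import Path

# Assumed naming pattern, inferred from test_2CAV_BFoptimal_Kaige.py.
_PATTERN = re.compile(r"^test_(\d+)CAV_BFoptimal_Kaige\.py$")


def available_cav_tests(directory="."):
    """Return the sorted CAV counts for which a test script exists in `directory`."""
    counts = []
    for path in Path(directory).iterdir():
        match = _PATTERN.match(path.name)
        if match:
            counts.append(int(match.group(1)))
    return sorted(counts)


if __name__ == "__main__":
    print(available_cav_tests())
```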
@misc{garage,
author = {The garage contributors},
title = {Garage: A toolkit for reproducible reinforcement learning research},
year = {2019},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/rlworkgroup/garage}},
commit = {be070842071f736eb24f28e4b902a9f144f5c97b}
}
@article{DBLP:journals/corr/FinnAL17,
author = {Chelsea Finn and Pieter Abbeel and Sergey Levine},
title = {Model-{A}gnostic {M}eta-{L}earning for {F}ast {A}daptation of {D}eep {N}etworks},
journal = {International Conference on Machine Learning (ICML)},
year = {2017},
url = {http://arxiv.org/abs/1703.03400}
}