Create a conda environment named dmap by running:
conda env create -f environment.yml
WARNING: the pybullet_envs library used in this project has an unresolved import error in the file <anaconda_dir>/envs/dmap/lib/python3.7/site-packages/pybullet_envs/robot_locomotors.py. To fix it, change line 1 to
from pybullet_envs.robot_bases import XmlBasedRobot, MJCFBasedRobot, URDFBasedRobot
and line 6 to
from pybullet_envs.robot_bases import BodyPart
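The two-line fix described in the warning can also be applied with a short script rather than by hand. The following is a minimal sketch, not part of the repository: the helper name patch_lines is hypothetical, and it assumes the installed file still has the faulty imports on lines 1 and 6. It overwrites the file in place without keeping a backup.

```python
from pathlib import Path

# Replacement import lines taken from the warning, keyed by 1-based line number.
FIXED_IMPORTS = {
    1: "from pybullet_envs.robot_bases import XmlBasedRobot, MJCFBasedRobot, URDFBasedRobot",
    6: "from pybullet_envs.robot_bases import BodyPart",
}


def patch_lines(path, replacements=FIXED_IMPORTS):
    """Overwrite the given 1-based line numbers of a text file in place.

    Hypothetical helper: it does not check what the old lines contained,
    so inspect the file first if your pybullet_envs version differs.
    """
    path = Path(path)
    lines = path.read_text().splitlines()
    for number, new_line in replacements.items():
        if number > len(lines):
            raise ValueError(f"{path} has no line {number}")
        lines[number - 1] = new_line
    path.write_text("\n".join(lines) + "\n")


# Example call; substitute your own <anaconda_dir> before running:
# patch_lines("<anaconda_dir>/envs/dmap/lib/python3.7/site-packages/pybullet_envs/robot_locomotors.py")
```
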
Activate the environment:
conda activate dmap
To train an agent, run:
python main_train.py
By default, this script trains a Simple Walker with sigma = 0.1. To change the agent or the algorithm, point the config_path parameter to a different .json file in the configs folder (e.g., "walker" -> "hopper", "simple_walker.json" -> "dmap_hopper.json"). Many other parameters, such as the value of sigma, the random seed, or the hyperparameters of SAC, can be set in the configuration file.
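If you want to keep the provided configuration files untouched, one option is to write a modified copy and point config_path to it. The sketch below is an illustration under assumptions: the helper write_config_variant is hypothetical, it only handles top-level keys, and the key names in the example call ("sigma", "seed") are guesses that should be checked against the actual .json file.

```python
import json
from pathlib import Path


def write_config_variant(source, target, **overrides):
    """Copy a JSON config, replacing the top-level keys given in overrides.

    Hypothetical helper: nested settings are not handled, and unknown keys
    are rejected so that a misspelled name does not pass silently.
    """
    config = json.loads(Path(source).read_text())
    unknown = [key for key in overrides if key not in config]
    if unknown:
        raise KeyError(f"keys not present in {source}: {unknown}")
    config.update(overrides)
    Path(target).write_text(json.dumps(config, indent=4) + "\n")
    return config


# Example call with assumed key names and an assumed output file name:
# write_config_variant("configs/simple_walker.json",
#                      "configs/simple_walker_variant.json",
#                      sigma=0.3, seed=1)
```
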
The script logs to the folder output/training/<current_date>. To track progress, run:
tensorboard --logdir output/training/<current_date>
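Since the log folder name depends on the date of the run, it can be convenient to look up the most recent one automatically. The sketch below is not part of the repository: the helper latest_run_dir is hypothetical, and it picks the subfolder by modification time because the exact format of <current_date> is not assumed here.

```python
from pathlib import Path


def latest_run_dir(root="output/training"):
    """Return the most recently modified subfolder of root, or None.

    Hypothetical helper: relies on file-system modification times,
    not on parsing the folder names.
    """
    root = Path(root)
    if not root.is_dir():
        return None
    runs = [p for p in root.iterdir() if p.is_dir()]
    return max(runs, key=lambda p: p.stat().st_mtime, default=None)


if __name__ == "__main__":
    run = latest_run_dir()
    if run is None:
        print("No training runs found under output/training")
    else:
        # Print the command to run rather than launching TensorBoard directly.
        print(f"tensorboard --logdir {run}")
```
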
To train a TCN agent to imitate the environment encoder network of Oracle (RMA training procedure), run:
python main_rma.py
We provide a pretrained Oracle Half Cheetah agent (sigma = 0.1, seed = 2) so that the script works out of the box. To change the environment, sigma, or the random seed, modify the first lines of the script. NB: make sure you have trained an Oracle agent with the same parameters, and that it is in the correct subfolder of data (as with the pretrained Half Cheetah model).
To evaluate trained agents, run:
python main_evaluation.py
The script is set to evaluate Oracle Half Cheetah, sigma = 0.1 and seed = 2, which is provided as a pretrained agent. To evaluate the other provided pretrained agent, DMAP Ant with sigma = 0.1 and seed = 2, change the script parameters env_name from "half_cheetah" to "ant" and algorithm from "oracle" to "dmap". The same script can also produce the ablation results, in which DMAP is run ignoring the output of the attention encoding network. In this case, the parameter algorithm must be set to "dmap-ne".
The results of the evaluation of all the trained agents are in the folders data/<agent_name>/performance. Running the notebook performance_dataset.ipynb generates the pickle files included in the data folder. For the analysis of the results, use the notebook performance_analysis.ipynb, which generates the tables included in the paper.
To create the attention dataset for a single model, run:
python main_attention.py
The default configuration will generate the attention dataset for DMAP Ant, sigma = 0.1, seed = 2 (available as a pretrained model). To run this script with other configurations, first make sure to have trained the corresponding DMAP agent.
TBD.