OP3


We present object-centric perception, prediction, and planning (OP3), an entity-centric dynamic latent variable framework for model-based reinforcement learning that acquires entity representations from raw visual observations without supervision and uses them to predict and plan.

CoRL 2019 Conference Paper: "Entity Abstraction in Visual Model-Based Reinforcement Learning".

More information is available at the project website.

Table of Contents

  • Installation
  • Running Experiments
  • Training OP3
  • Running MPC
  • Using a GPU
  • Visualizing results
  • Launching jobs with doodad
  • Credits

Installation

  1. Copy conf.py to conf_private.py:
cp op3/launchers/conf.py op3/launchers/conf_private.py
  2. Install and activate the included Anaconda environment:
$ conda env create -f environment/linux-gpu-env.yml
$ source activate op3

These Anaconda environments use MuJoCo 1.5 and gym 0.10.5, which are not needed for training the model but are needed for generating the datasets and running MPC. You'll need your own MuJoCo key if you want to use MuJoCo. A quick way to sanity-check the environment is sketched after this list.

  3. Download this version of doodad and add the repo to your PYTHONPATH. The Docker and doodad dependencies are only needed when running on AWS or GCP.
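
As the optional sanity check mentioned above, the snippet below (a minimal sketch, assuming the op3 environment is active and Python is launched from the repository root) verifies that PyTorch imports and reports whether the MuJoCo/gym extras are present:

import torch

print("PyTorch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())

# mujoco_py and gym are only needed for dataset generation and MPC,
# so a failed import here does not block model training.
try:
    import gym
    import mujoco_py  # noqa: F401  (requires a valid MuJoCo key)
    print("gym:", getattr(gym, "__version__", "unknown"), "| mujoco_py imported OK")
except Exception as exc:
    print("MuJoCo/gym unavailable (only needed for data generation and MPC):", exc)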

Running Experiments

Datasets and Tasks

Download the datasets from this Google Drive folder to op3/data/datasets/.
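
To confirm that a download is intact, you can list the top-level contents of an HDF5 file. This is a minimal sketch using h5py; the internal key names are not specified in this README, so treat the output as exploratory:

import h5py

path = "op3/data/datasets/stack_o2p2_60.h5"  # or pickplace_multienv_10k.h5

with h5py.File(path, "r") as f:
    def describe(name, obj):
        # Print each dataset's name, shape, and dtype (groups are skipped).
        if isinstance(obj, h5py.Dataset):
            print(name, obj.shape, obj.dtype)
    f.visititems(describe)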

Single-Step Block Stacking


The stack_o2p2_60.h5 dataset contains 60,000 before-and-after images of blocks being dropped. This is the same dataset used in O2P2, "Reasoning About Physical Interactions with Object-Oriented Prediction and Planning."

This dataset can also be generated using the MuJoCo environment:

cd op3/envs/blocks/mujoco
python stacking_generating_singlestep.py --num_images NUM_IMAGES 

which will, by default, output the dataset to op3/envs/blocks/rendered/blocks.h5. See the args in the file for more options, such as controlling the number of objects in the scene.

Multi-Step Block Stacking


The pickplace_multienv_10k.h5 dataset contains 10,000 trajectories, each consisting of five frames of blocks being randomly picked and placed.

This dataset can also be generated using the MuJoCo environment. Run

cd op3/envs/blocks/mujoco
python block_pick_and_place.py -f OUTPUT_FILENAME --num_sims NUM_SIMS

See the args in the file for more options, such as controlling the number of objects in the scene and biasing pick locations toward the objects.

Real-World Action-Conditioned Video Prediction


The robotic pushing dataset is from https://sites.google.com/berkeley.edu/robotic-interaction-datasets.

Training OP3

To train OP3, run

python exps/train_op3/train_op3.py --variant [stack, pickplace, cloth] --debug 0

where the variants are stack for single-step block stacking, pickplace for multi-step block stacking, and cloth for the real-world evaluation on the robotic pushing dataset. Each variant loads its parameters from op3/exp_variants/variants.py, which can also be modified or extended. The preprocessed cloth dataset can be downloaded from here.
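
To queue up all three variants in sequence, a minimal launcher sketch (assuming it is run from the repository root with the op3 environment active) could look like this:

import subprocess

# Train OP3 on each variant listed above, one after another.
for variant in ["stack", "pickplace", "cloth"]:
    subprocess.run(
        ["python", "exps/train_op3/train_op3.py", "--variant", variant, "--debug", "0"],
        check=True,  # stop immediately if a run fails
    )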

Running MPC

To run visual MPC with a trained OP3 model, run the following for single-step block stacking:

python exps/stack_exps/mpc_stack.py -m <stack_model_params_file>

and the following for multi-step block stacking:

python exps/pickplace_exps/mpc_pickplace.py -m <pickplace_model_params_file>

where the -m argument is the name of the previously trained model file, e.g. in op3/data/saved_models. Pretrained models are provided in the appropriate saved_models directory, such as OP3/exps/stack_exps/saved_models.
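
If you simply want to point -m at your most recent training run, a small helper sketch might look like the following; the glob pattern and the use of the bare filename are assumptions, so adjust them to how your model files are actually named and resolved:

import subprocess
from pathlib import Path

save_dir = Path("op3/data/saved_models")
# NOTE: the "*" pattern is an assumption; narrow it to your actual file naming.
candidates = sorted(
    (p for p in save_dir.glob("*") if p.is_file()),
    key=lambda p: p.stat().st_mtime,
)
latest = candidates[-1]  # most recently modified model file
subprocess.run(
    ["python", "exps/stack_exps/mpc_stack.py", "-m", latest.name],
    check=True,
)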

Using a GPU

You can use a GPU by calling

import op3.torch.pytorch_util as ptu
ptu.set_gpu_mode(True)

before launching the scripts.
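
To make the same script work on CPU-only machines, you can gate the call on CUDA availability (a small sketch built on the ptu.set_gpu_mode call above):

import torch
import op3.torch.pytorch_util as ptu

# Enable the GPU only when CUDA is actually available.
ptu.set_gpu_mode(torch.cuda.is_available())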

If you are using doodad (see below), simply use the use_gpu flag:

run_experiment(..., use_gpu=True)

Visualizing results

During training, results will be saved under

LOCAL_LOG_DIR/<exp_prefix>/<foldername>
  • LOCAL_LOG_DIR is the directory set by op3.launchers.conf.LOCAL_LOG_DIR. The default is op3/data/logs.
  • <exp_prefix> is the experiment prefix given to setup_logger.
  • <foldername> is auto-generated based on exp_prefix.
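
To see which runs exist locally, you can list that directory structure directly (a sketch assuming the default op3/data/logs location):

from pathlib import Path

log_dir = Path("op3/data/logs")  # change this if you customized LOCAL_LOG_DIR
for run_dir in sorted(log_dir.glob("*/*")):
    # Each entry is LOCAL_LOG_DIR/<exp_prefix>/<foldername>.
    if run_dir.is_dir():
        print(run_dir)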

You can visualize results with viskit.

python viskit/viskit/frontend.py LOCAL_LOG_DIR/<exp_prefix>/

This viskit repo also has a few nice extra features, such as plotting multiple Y-axis values at once, splitting figures on multiple keys, and filtering out hyperparameters.

Launching jobs with doodad

The run_experiment function makes it easy to run Python code on Amazon Web Services (AWS) or Google Cloud Platform (GCP) by using doodad.

It's as easy as:

from op3.launchers.launcher_util import run_experiment

def function_to_run(variant):
    learning_rate = variant['learning_rate']
    ...

run_experiment(
    function_to_run,
    exp_prefix="my-experiment-name",
    mode='ec2',  # or 'gcp'
    variant={'learning_rate': 1e-3},
)

You will need to set up parameters in conf_private.py (see step one of Installation). This requires some knowledge of AWS and/or GCP, which is beyond the scope of this README. To learn more about doodad, see its repository.

Credits

A lot of the coding infrastructure is based on rlkit.

The Dockerfile is based on the OpenAI mujoco-py Dockerfile.