[![License](https://img.shields.io/badge/license-MIT-blue.svg)](https://github.com/Farama-Foundation/metaworld/blob/master/LICENSE)
![Build Status](https://github.com/Farama-Foundation/Metaworld/workflows/MetaWorld%20CI/badge.svg)

# The current version of Meta-World is a work in progress. If you find any bugs or errors, please open an issue.

__Meta-World is an open-source simulated benchmark for meta-reinforcement learning and multi-task learning consisting of 50 distinct robotic manipulation tasks.__ We aim to provide task distributions that are sufficiently broad to evaluate meta-RL algorithms' generalization ability to new behaviors.

For more background information, please refer to our [website](https://metaworld.farama.org/) and the accompanying [conference publication](https://arxiv.org/abs/1910.10897), which **provides baseline results for 8 state-of-the-art meta- and multi-task RL algorithms**.

__Table of Contents__
- [Installation](#installation)
- [Using the benchmark](#using-the-benchmark)
- [Accessing Single Goal Environments](#accessing-single-goal-environments)
- [Citing Meta-World](#citing-meta-world)
- [Accompanying Baselines](#accompanying-baselines)
- [Become a Contributor](#become-a-contributor)
- [Acknowledgements](#acknowledgements)
The current roadmap for Meta-World can be found in the [Metaworld repository](https://github.com/Farama-Foundation/Metaworld).
## Installation
To install everything, run:


```
pip install git+https://github.com/Farama-Foundation/Metaworld.git@master#egg=metaworld
```
Alternatively, you can clone the repository and install an editable version locally:
```
git clone https://github.com/Farama-Foundation/Metaworld.git
cd Metaworld
pip install -e .
```

If you are attempting to reproduce results found in the Meta-World paper, please use this command:
```
pip install git+https://github.com/Farama-Foundation/Metaworld.git@04be337a12305e393c0caf0cbf5ec7755c7c8feb
```
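To quickly check that the installation worked, a minimal sketch is to import the package and list the available task names; this only relies on `metaworld.ML1.ENV_NAMES`, which is also used in the examples below:

```python
import metaworld

# Should print the names of the 50 manipulation tasks, e.g. 'pick-place-v2'.
print(metaworld.ML1.ENV_NAMES)
```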

## Using the benchmark
Here is a list of benchmark environments for meta-RL (ML*) and multi-task-RL (MT*):
* [__ML1__](https://meta-world.github.io/figures/ml1.gif) is a meta-RL benchmark environment which tests few-shot adaptation to goal variation within a single task. You can choose to test variation within any of the [50 tasks](https://meta-world.github.io/figures/ml45-1080p.gif) for this benchmark.
### Basics
We provide a `Benchmark` API that allows constructing environments following the [`gymnasium.Env`](https://github.com/Farama-Foundation/Gymnasium/blob/main/gymnasium/core.py#L21) interface.

To use a `Benchmark`, first construct it (this samples the tasks allowed for one run of an algorithm on the benchmark).
Then, construct at least one instance of each environment listed in `benchmark.train_classes` and `benchmark.test_classes`.
Each of those environments must then be assigned a task with `env.set_task(task)`, where the task comes from
`benchmark.train_tasks` or `benchmark.test_tasks`, respectively.
A task can only be assigned to an environment whose key in `benchmark.train_classes` or `benchmark.test_classes`
matches `task.env_name`.
Please see the sections [Running ML1, MT1](#running-ml1-or-mt1) and [Running ML10, ML45, MT10, MT50](#running-a-benchmark)
for more details.

You may wish to access only the individual environments used in the Meta-World benchmark for your research. See
[Accessing Single Goal Environments](#accessing-single-goal-environments) for more details.
### Seeding a benchmark instance
For the purpose of reproducibility, you may want to seed your benchmark instance. For example, for the ML1 benchmark with the 'pick-place-v2' environment, this can be done as follows:
```python
import metaworld

SEED = 0  # some seed number here
benchmark = metaworld.ML1('pick-place-v2', seed=SEED)
```
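One thing seeding gives you is that two benchmark instances constructed with the same seed sample the same tasks. A minimal sketch of that property, assuming each `Task` carries its sampled goal in a serialized `data` field (as in the current `metaworld` source):

```python
import metaworld

SEED = 0
b1 = metaworld.ML1('pick-place-v2', seed=SEED)
b2 = metaworld.ML1('pick-place-v2', seed=SEED)

# Same seed, so both instances should have sampled identical task (goal) configurations.
assert all(t1.data == t2.data for t1, t2 in zip(b1.train_tasks, b2.train_tasks))
```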

### Running ML1 or MT1
```python
import gymnasium as gym
import metaworld
import random

print(metaworld.ML1.ENV_NAMES) # Check out the available environments

ml1 = metaworld.ML1('pick-place-v2') # Construct the benchmark, sampling tasks

env = ml1.train_classes['pick-place-v2']()  # Create an environment for the pick-place task
task = random.choice(ml1.train_tasks)
env.set_task(task)  # Set task

obs, info = env.reset()  # Reset environment
a = env.action_space.sample()  # Sample an action
obs, reward, terminate, truncate, info = env.step(a)  # Step the environment with the sampled random action
```
__MT1__ can be run the same way, except that it does not contain any `test_tasks`; a short sketch is shown below.
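A minimal MT1 sketch, assuming the same `train_classes`/`train_tasks` API as ML1:

```python
import metaworld
import random

mt1 = metaworld.MT1('pick-place-v2')  # Construct the benchmark, sampling tasks

env = mt1.train_classes['pick-place-v2']()
env.set_task(random.choice(mt1.train_tasks))  # MT1 has train tasks only, no test_tasks

obs, info = env.reset()
a = env.action_space.sample()
obs, reward, terminate, truncate, info = env.step(a)
```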
### Running a benchmark
Create an environment with train tasks (ML10, MT10, ML45, or MT50):
```python
import gymnasium as gym
import metaworld
import random

ml10 = metaworld.ML10() # Construct the benchmark, sampling tasks

training_envs = []
for name, env_cls in ml10.train_classes.items():
    env = env_cls()
    task = random.choice([task for task in ml10.train_tasks
                          if task.env_name == name])
    env.set_task(task)
    training_envs.append(env)

for env in training_envs:
    obs, info = env.reset()  # Reset environment
    a = env.action_space.sample()  # Sample an action
    obs, reward, terminate, truncate, info = env.step(a)  # Step the environment with the sampled random action
```
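To run whole episodes rather than a single random step, a rough evaluation loop over the training environments looks like the sketch below (reusing `training_envs` from the snippet above). The episode length of 150 steps is an arbitrary choice for illustration; Meta-World environments report a `success` flag in `info`, which the sketch reads defensively with `.get`:

```python
for env in training_envs:
    obs, info = env.reset()
    solved = False
    for _ in range(150):  # arbitrary episode length for illustration
        a = env.action_space.sample()  # replace with your policy's action
        obs, reward, terminate, truncate, info = env.step(a)
        solved = solved or info.get('success', 0.0) > 0.0
        if terminate or truncate:
            break
    print(type(env).__name__, 'solved:', solved)
```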
Create an environment with test tasks (this only works for ML10 and ML45, since MT10 and MT50 don't have a separate set of test tasks):
```python
import gymnasium as gym
import metaworld
import random

ml10 = metaworld.ML10() # Construct the benchmark, sampling tasks

testing_envs = []
for name, env_cls in ml10.test_classes.items():
    env = env_cls()
    task = random.choice([task for task in ml10.test_tasks
                          if task.env_name == name])
    env.set_task(task)
    testing_envs.append(env)

for env in testing_envs:
    obs, info = env.reset()  # Reset environment
    a = env.action_space.sample()  # Sample an action
    obs, reward, terminate, truncate, info = env.step(a)  # Step the environment with the sampled random action
```
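Tasks can be reassigned at any time, so if you want a fresh goal for every episode you can resample from `ml10.test_tasks` before each reset. A short sketch, reusing `ml10` from the block above:

```python
import random

# Pick one test environment and the tasks whose env_name matches its key.
name, env_cls = next(iter(ml10.test_classes.items()))
env = env_cls()
matching_tasks = [t for t in ml10.test_tasks if t.env_name == name]

for episode in range(3):
    env.set_task(random.choice(matching_tasks))  # new goal for this episode
    obs, info = env.reset()
```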

## Accessing Single Goal Environments
You may wish to access only the individual environments used in the Meta-World benchmark for your research.
We provide constructors both for environments in which the goal is hidden (zeroed out in the observation) and for
environments in which the goal is observable; these are called GoalHidden and GoalObservable environments, respectively.

You can access them in the following way:
```python
from metaworld.envs import (ALL_V2_ENVIRONMENTS_GOAL_OBSERVABLE,
                            ALL_V2_ENVIRONMENTS_GOAL_HIDDEN)
# these are ordered dicts where the key : value
# is env_name : env_constructor

import numpy as np

door_open_goal_observable_cls = ALL_V2_ENVIRONMENTS_GOAL_OBSERVABLE["door-open-v2-goal-observable"]
door_open_goal_hidden_cls = ALL_V2_ENVIRONMENTS_GOAL_HIDDEN["door-open-v2-goal-hidden"]

env = door_open_goal_hidden_cls()
env.reset() # Reset environment
a = env.action_space.sample() # Sample an action
obs, reward, terminate, truncate, info = env.step(a)  # Step the environment with the sampled random action
assert (obs[-3:] == np.zeros(3)).all() # goal will be zeroed out because env is HiddenGoal

# You can choose to initialize the random seed of the environment.
# The state of your rng will remain unaffected after the environment is constructed.
env1 = door_open_goal_observable_cls(seed=5)
env2 = door_open_goal_observable_cls(seed=5)

env1.reset() # Reset environment
env2.reset()
a1 = env1.action_space.sample() # Sample an action
a2 = env2.action_space.sample()
next_obs1, _, _, _, _ = env1.step(a1)  # Step the environment with the sampled random action

next_obs2, _, _, _, _ = env2.step(a2)
assert (next_obs1[-3:] == next_obs2[-3:]).all() # 2 envs initialized with the same seed will have the same goal
assert not (next_obs2[-3:] == np.zeros(3)).all() # The envs are goal observable, meaning the goal is not zeroed out

env3 = door_open_goal_observable_cls(seed=10) # Construct an environment with a different seed
env1.reset() # Reset environment
env3.reset()
a1 = env1.action_space.sample() # Sample an action
a3 = env3.action_space.sample()
next_obs1, _, _, _, _ = env1.step(a1)  # Step the environment with the sampled random action
next_obs3, _, _, _, _ = env3.step(a3)

assert not (next_obs1[-3:] == next_obs3[-3:]).all() # 2 envs initialized with different seeds will have different goals
assert not (next_obs1[-3:] == np.zeros(3)).all() # The envs are goal observable, meaning the goal is not zeroed out
```

## Citing Meta-World
If you use Meta-World for academic research, please kindly cite our CoRL 2019 paper using the following BibTeX entry.

```
@inproceedings{yu2019meta,
  title={Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning},
  author={Tianhe Yu and Deirdre Quillen and Zhanpeng He and Ryan Julian and Karol Hausman and Chelsea Finn and Sergey Levine},
  booktitle={Conference on Robot Learning (CoRL)},
  year={2019},
  eprint={1910.10897},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/1910.10897}
}
```

## Accompanying Baselines
If you're looking for implementations of the baseline algorithms used in the Meta-World conference publication, please look at our sister repository, [Garage](https://github.com/rlworkgroup/garage).

Note that these aren't exactly the same baselines that were used in the original conference publication; however, they are true to the original baselines.

## Become a Contributor
We welcome all contributions to Meta-World. Please refer to the [contributor's guide](https://github.com/Farama-Foundation/Metaworld/blob/master/CONTRIBUTING.md) for how to prepare your contributions.

## Acknowledgements
Meta-World is now maintained by the Farama Foundation. You can interact with our community and the Meta-World maintainers in our [Discord server](https://discord.gg/PfR7a79FpQ).

Meta-World is a work created by [Tianhe Yu (Stanford University)](https://cs.stanford.edu/~tianheyu/), [Deirdre Quillen (UC Berkeley)](https://scholar.google.com/citations?user=eDQsOFMAAAAJ&hl=en), [Zhanpeng He (Columbia University)](https://zhanpenghe.github.io), [Ryan Julian (University of Southern California)](https://ryanjulian.me), [Karol Hausman (Google AI)](https://karolhausman.github.io), [Chelsea Finn (Stanford University)](https://ai.stanford.edu/~cbfinn/) and [Sergey Levine (UC Berkeley)](https://people.eecs.berkeley.edu/~svlevine/).

The code for Meta-World was originally based on [multiworld](https://github.com/vitchyr/multiworld), which was developed by [Vitchyr H. Pong](https://people.eecs.berkeley.edu/~vitchyr/), [Murtaza Dalal](https://github.com/mdalal2020), [Ashvin Nair](http://ashvin.me/), [Shikhar Bahl](https://shikharbahl.github.io), [Steven Lin](https://github.com/stevenlin1111), [Soroush Nasiriany](http://snasiriany.me/), [Kristian Hartikainen](https://hartikainen.github.io/) and [Coline Devin](https://github.com/cdevin). The Meta-World authors are grateful for their efforts in providing such a great framework as a foundation for our work. We would also like to thank Russell Mendonca for his work on reward functions for some of the environments.
