Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos

uci-cbcl/Video-Pre-Training

 
 

Video-Pre-Training

📄 Read Paper
📣 Blog Post
👾 MineRL Environment (note: version 1.0+ required)
🏁 MineRL BASALT Competition

Running models

Install requirements with:

pip install git+https://github.com/minerllabs/minerl@v1.0.0
pip install -r requirements.txt

To run the code, call:

python run_agent.py --model [path to .model file] --weights [path to .weights file]

Once loading finishes, a window should open showing the agent playing Minecraft.
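
As a convenience, the launch command above can be assembled and checked from Python before starting the agent. The sketch below is not part of this repository: `build_run_command` is a hypothetical helper, and only the `run_agent.py` script name and the `--model` and `--weights` flags are taken from the command above. The file-existence check is an added assumption about what is useful to validate up front.

```python
import sys
from pathlib import Path


def build_run_command(model_path, weights_path, script="run_agent.py"):
    """Return the argument list for launching the agent script.

    Hypothetical helper: only the --model and --weights flags come from
    the README. The existence checks are an added convenience so that a
    mistyped path fails before Minecraft starts loading.
    """
    model = Path(model_path)
    weights = Path(weights_path)
    for label, path in (("model", model), ("weights", weights)):
        if not path.is_file():
            raise FileNotFoundError(f"{label} file not found: {path}")
    # Use the current interpreter so the same environment is reused.
    return [sys.executable, script, "--model", str(model), "--weights", str(weights)]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root, which should behave the same as typing the command by hand.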

Model Zoo

Below are the model files and weights files for various pre-trained Minecraft models. The 1x, 2x and 3x model files correspond to the width of their respective model weights, so a weights file should be loaded with the model file of the same width.
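
Because a weights file has to be matched with the model file of the same width, it can help to pair downloaded files automatically. The following is a minimal sketch under a loud assumption: that each filename contains a width tag such as `1x`, `2x` or `3x`. The filenames in the usage note and the helpers `width_of` and `pair_by_width` are hypothetical and are not defined anywhere in this repository.

```python
import re
from pathlib import Path

# Assumption: the width appears in the filename as "1x", "2x" or "3x",
# delimited by non-alphanumeric characters (for example "-2x.").
WIDTH_TAG = re.compile(r"(?<![A-Za-z0-9])([123]x)(?![A-Za-z0-9])")


def width_of(filename):
    """Return the width tag found in the filename, or None if absent."""
    match = WIDTH_TAG.search(Path(filename).name)
    return match.group(1) if match else None


def pair_by_width(model_files, weights_files):
    """Pair each weights file with the model file of the same width."""
    models = {width_of(m): m for m in model_files}
    pairs = []
    for weights in weights_files:
        width = width_of(weights)
        if width is None or width not in models:
            raise ValueError(f"no model file matches weights file: {weights}")
        pairs.append((models[width], weights))
    return pairs
```

For example, `pair_by_width(["example-1x.model", "example-2x.model"], ["example-rl-2x.weights"])` pairs the 2x weights with the 2x model file and raises `ValueError` if no model of that width is present.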

Demonstration Only - Behavioral Cloning

These models are trained with behavioral cloning (BC) on video demonstrations of humans playing Minecraft. They are more general than the later models, which use reinforcement learning (RL) to further optimize the policy. Foundational models are trained across all videos in a single training run, while the house and early game models further fine-tune the foundational model of the same size using either the housebuilding contractor data or the early game video subset. See the paper linked above for more details.

Foundational Model 📈

Fine-Tuned from House 📈

Fine-Tuned from Early Game 📈

Models With Environment Interactions

These models further refine the demonstration-based models above with a reward function targeted at obtaining diamond pickaxes. While less general than the behavioral cloning models, they have the benefit of interacting with the environment using a reward function and excel at progressing through the tech tree quickly. See the paper for more information on how they were trained and the exact reward schedule.

RL from Foundation 📈

RL from House 📈

RL from Early Game 📈

Contractor Demonstrations Dataset

We are currently working on releasing the contractor data collected over the course of the project. Links to index files with more information will be added here as the data is released.

Contribution

This was a large effort by a dedicated team at OpenAI: Bowen Baker, Ilge Akkaya, Peter Zhokhov, Joost Huizinga, Jie Tang, Adrien Ecoffet, Brandon Houghton, Raul Sampedro, Jeff Clune. The code here represents a minimal version of our model code, which was prepared by Anssi Kanervisto and others so that these models could be used as part of the MineRL BASALT competition.
