This is the repository for the project *Deep Reinforcement Learning for the cooperative card game Hanabi*, by Bram Grooten and his master's thesis supervisors: Jelle Wemmenhove, Maurice Poot, and Jim Portegies. Read the corresponding paper here. Our code builds upon the Hanabi Learning Environment provided by DeepMind.
```
drl_for_hanabi/                  # directory that we have added
    algorithms/                  # the main algorithms: SPG, VPG, PPO
    experiments/                 # files to train the agents with different settings
    extra/                       # some extra algorithms, not used in the paper
        decentralized/           # files that make decentralized test runs possible
                                 #   (all training is done in a centralized manner)
        rule_based/              # some rule-based agents (no learning involved)
        supervised/              # imitation learning approach
        tabular/                 # a Q-learning agent
    utils/                       # often used utilities, like action & state representations
hanabi_learning_environment/     # the actual HLE
    rl_env.py                    # the main reinforcement learning environment
    rl_env_adjusted.py           # some adjustments added by us
.gitignore                       # for git
Hanabi_paper_with_appendix.pdf   # the paper corresponding to this project
README.md                        # this file
requirements.txt                 # packages that need to be installed
clean_all.sh                     # files from DeepMind's HLE
CMakeLists.txt
LICENSE
pyproject.toml
setup.py
```
You need some version of Linux to run this; a Windows Subsystem for Linux (WSL) works as well. See an installation guide for WSL here. To use WSL with PyCharm on Windows, see this useful guide.
- Open your Linux or WSL terminal.
- `git clone` this repository to your local computer.
- Create a new venv to accompany this repo, and make sure it's activated. We use Python 3.8.10, so make sure you have that in your venv. (See instructions specific to WSL and PyCharm here.)
- This repo is a copy of the Hanabi Learning Environment (HLE) with added files and folders. In your Linux or WSL terminal, run `pip install -e .` from within the top directory (the one containing the `setup.py` file). The `-e` flag makes sure that you can adjust the HLE package a bit if you wish, like we did. As noted on the HLE page, you might need to run these first:

  ```shell
  sudo apt-get install g++         # if you don't already have a CXX compiler
  sudo apt-get install python-pip  # if you don't already have pip
  ```

- Now run `pip install -r requirements.txt` from within the top directory (the one containing the `requirements.txt` file), to install the packages that we use on top of the HLE.
- Go to the `drl_for_hanabi/extra/decentralized/` directory.
- Run the file `hanabi_runner.py`. At the bottom of that file you can choose which agents you want to play with. Select the `HumanPlayer()` as one of the agents if you want to interact with them.
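To give an idea of what selecting agents at the bottom of such a runner file looks like, here is a purely illustrative sketch. It is not the repo's actual API: `RandomAgent` and the `act` signature are hypothetical stand-ins; only the `HumanPlayer` name comes from the text above.

```python
import random

class RandomAgent:
    """Hypothetical agent that picks a random legal move index."""
    def act(self, num_legal_moves):
        return random.randrange(num_legal_moves)

class HumanPlayer:
    """Hypothetical stand-in: would prompt a person for their move."""
    def act(self, num_legal_moves):
        return int(input(f"Choose a move in [0, {num_legal_moves}): "))

# At the bottom of the runner file you would pick the lineup, e.g.:
agents = [RandomAgent(), RandomAgent()]
# Swap one entry for HumanPlayer() to play along yourself:
# agents = [HumanPlayer(), RandomAgent()]
```

Again, the real `hanabi_runner.py` defines its own agent classes; this only sketches the pattern of swapping agent instances in a list.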
- Go to the `drl_for_hanabi/algorithms/` directory.
- Run one of the files `ppo.py`, `vpg.py`, or `spg.py`. At the top of each file you can change how long the algorithm trains (`RUNTIME`, in hours) or for how many epochs (`EPOCHS`; set `RUNTIME ≤ 0`). You can change the hyperparameters as well, such as the learning rate `LR` or the discount factor `GAMMA`.
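As a reminder of what `GAMMA` controls: a discount factor weights future rewards, so the discounted return of a reward sequence is G = Σ γ^t r_t. A minimal sketch, independent of this repo's code:

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**t * r_t, computed backwards: G_t = r_t + gamma * G_{t+1}."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

# Three rewards of 1 with gamma = 0.9: 1 + 0.9 + 0.81 = 2.71
print(discounted_return([1.0, 1.0, 1.0], 0.9))  # 2.71
```

A `GAMMA` closer to 1 makes the agent value long-term rewards more; closer to 0 makes it myopic.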
  The network size can easily be adjusted: if you want 2 hidden layers with 96 and 64 nodes respectively, then set `HIDDEN_LAYERS = [96, 64]`.
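To illustrate how a setting like `HIDDEN_LAYERS = [96, 64]` typically translates into a fully connected network, here is a small sketch. The observation and action dimensions below are made-up placeholders, not the repo's actual values:

```python
def layer_shapes(obs_dim, hidden_layers, act_dim):
    """Return the (inputs, outputs) shape of each fully connected layer."""
    sizes = [obs_dim] + hidden_layers + [act_dim]
    return list(zip(sizes[:-1], sizes[1:]))

# With HIDDEN_LAYERS = [96, 64], a hypothetical 100-dim observation and
# 20 possible actions give three weight matrices:
print(layer_shapes(100, [96, 64], 20))  # [(100, 96), (96, 64), (64, 20)]
```

So each extra entry in `HIDDEN_LAYERS` adds one hidden layer between the observation input and the action output.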