- written by Daniel Skymoen, Edvard Schøyen, Jarand Romestrand, Kristian Vaula Jensen
- Choose an environment - test different algorithms
- Choose an algorithm - test different environments
- Create an environment
- Explore different configurations of an algorithm
- Try a hardcore environment: MineRL, Atari, Doom
We chose to try a hardcore environment and ended up with Super Mario Bros.
Read more about the environment here: https://pypi.org/project/gym-super-mario-bros/
Compare the performance of DDQN (Double Deep Q-Network) and PPO (Proximal Policy Optimization) in the Super Mario Bros environment.
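For reference, here is a minimal sketch of how the environment can be created and stepped with random actions, following the gym-super-mario-bros example from the link above (this is not the project's training code):
from nes_py.wrappers import JoypadSpace
import gym_super_mario_bros
from gym_super_mario_bros.actions import SIMPLE_MOVEMENT

# Create the environment and restrict the action space to a small set of moves
env = gym_super_mario_bros.make('SuperMarioBros-v0')
env = JoypadSpace(env, SIMPLE_MOVEMENT)

done = True
for step in range(1000):
    if done:
        state = env.reset()
    # Take a random action and render the frame
    state, reward, done, info = env.step(env.action_space.sample())
    env.render()
env.close()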
Here are the requirements to run the project:
- Python >=3.9, <3.12
Recommended tools for easier setup and running of the project:
- Make
- Visual Studio C++ build tools
To list all the Make commands, use the command below:
make help
The guide for running the project assumes that you have Make installed. If you don't have it, you will find the equivalent commands in the Makefile.
This command creates a virtual environment for all the dependencies required to run the project and installs them.
make setup
python3 is the default; if your computer uses python instead of python3, you can override this with the command below:
make setup PYTHON=python
A .env file is required to use Neptune for logging the training of a model. It must be placed inside the src folder. Example of the contents needed for Neptune:
NEPTUNE_API_TOKEN="YOUR_API_KEY"
NEPTUNE_PROJECT_NAME="YOUR_NEPTUNE_PROJECT_NAME"
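As an illustration (assuming the python-dotenv and neptune packages), the values from the .env file could be loaded and used to start a Neptune run roughly like the sketch below; the project's actual logging code may differ:
import os
from dotenv import load_dotenv
import neptune

# Load NEPTUNE_API_TOKEN and NEPTUNE_PROJECT_NAME from the .env file
load_dotenv()

run = neptune.init_run(
    project=os.environ["NEPTUNE_PROJECT_NAME"],
    api_token=os.environ["NEPTUNE_API_TOKEN"],
)
run["params/algorithm"] = "ddqn"  # example metadata field
run.stop()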
The commands below activate the virtual environment created in the setup section and then run the Python code.
This command activates the environment and then trains the DDQN agent.
make ddqn
This command activates the environment and then trains the PPO agent.
make ppo
This command activates the environment and then renders a trained DDQN model.
make render-ddqn
This command activates the environment and then renders a trained PPO model.
make render-ppo
You can pass flags to both the ppo and ddqn targets to log the training to Neptune, and the flags can be combined. To specify flags for logging, use the commands below; a sketch of how such flags might be handled is shown after the examples.
Logs the training metrics so graphs can be made on Neptune:
make ddqn args="--log"
Logs the trained model to Neptune:
make ddqn args="--log-model"
Logs both the training metrics and the model to Neptune:
make ddqn args="--log --log-model"
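For illustration, such flags could be handled in a training script with argparse roughly as sketched below; the flag names match the ones above, but the surrounding code is hypothetical:
import argparse

parser = argparse.ArgumentParser(description="Train an agent on Super Mario Bros")
parser.add_argument("--log", action="store_true",
                    help="log training metrics to Neptune")
parser.add_argument("--log-model", action="store_true",
                    help="upload the trained model to Neptune")
args = parser.parse_args()

if args.log:
    print("Neptune metric logging enabled")
if args.log_model:  # argparse maps --log-model to log_model
    print("Neptune model logging enabled")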
python3 is the default, so if your computer uses python instead of python3 you can override this with the command below:
make ddqn PYTHON=python
To deactivate the virtual environment when you are done, use the command below:
deactivate
This command removes the virtual environment created in the setup section. You can also remove it manually.
make clean
Here are some example runs:
Example runs with DDQN:
An example run with PPO:
Here are graphs of the models' performance in the environment over 15 000 episodes: