This project contains the code to train an agent to solve the OpenAI Gym Mountain Car environment with Q-Learning and SARSA. It also contains pre-trained agents using both algorithms.
The environment is two-dimensional and consists of a car in a valley between two hills. The goal of the car is to reach a flag at the top of the hill on the right. The hills are too steep for the car to climb by accelerating in a single direction, so it has to drive back and forth to build up enough momentum to drive up.
There are two variables that determine the current state of the environment:
- The car position on the track, from -1.2 to 0.6
- The car velocity, from -0.07 to 0.07. Negative for left, and positive for right.
The car can take one of three different actions (the snippet after this list shows how to inspect both the state and action spaces):
- Accelerate to the left
- Don't accelerate
- Accelerate to the right.
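For reference, both the state ranges and the action count can be read directly from the environment. This is a minimal sketch assuming the classic Gym API (gym < 0.26); the scripts in this repo may construct the environment slightly differently.

```python
import gym

env = gym.make('MountainCar-v0')

# Observation: [position, velocity]
print(env.observation_space.low)   # [-1.2  -0.07]
print(env.observation_space.high)  # [ 0.6   0.07]

# Actions: 0 = accelerate left, 1 = don't accelerate, 2 = accelerate right
print(env.action_space.n)          # 3
```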
At each step, the car receives a reward based on the state it reaches after taking an action:
- Reward of 0 is awarded if the agent reached the flag (position = 0.5) on top of the mountain.
- Reward of -1 is awarded if the position of the agent is less than 0.5.
The car starts in the valley between the two hills, at a random position between -0.6 and -0.4, with velocity equal to 0.
The episode ends when the car reaches the flag (position > 0.5). The episode may also terminate when it hits the maximum number of steps (the default is 200; here I used 1000 for training).
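Putting the pieces together, a single episode looks roughly like the sketch below (classic Gym API, random policy for illustration). Overriding the step limit via _max_episode_steps is an assumption about how the 1000-step cap is applied; the training scripts may do it differently.

```python
import gym

env = gym.make('MountainCar-v0')
env._max_episode_steps = 1000   # raise the default 200-step limit (assumed to match the training setup)

obs = env.reset()               # random position in [-0.6, -0.4], velocity 0
done, total_reward = False, 0.0

while not done:
    action = env.action_space.sample()          # random policy, just for illustration
    obs, reward, done, info = env.step(action)
    total_reward += reward                      # -1 per step until the flag is reached

print('Episode return:', total_reward)
```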
You can run the following line to render a pre-trained agent with Q-learning. It will run for 10 episodes.
```
python run_qlearning_agent.py
```
The agent was trained for 100000 episodes. Its Q-values are saved in the file 'pre-trained-Q-Learning.pkl', which can be loaded with the function load_obj() from the auxFunctions.py file.
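If you want to load the Q-table yourself rather than using the run script, the sketch below shows the general idea. It assumes load_obj() takes the file name without the .pkl extension and returns Q-values indexed by a discretized (position, velocity) state; the discretization shown here is hypothetical and may differ from what the scripts actually use.

```python
import gym
import numpy as np
from auxFunctions import load_obj

# Load the saved Q-values (assuming the file name is passed without the .pkl extension)
Q = load_obj('pre-trained-Q-Learning')

env = gym.make('MountainCar-v0')
obs, done = env.reset(), False

while not done:
    state = tuple(np.round(obs, 2))    # hypothetical discretization; the scripts may bin the state differently
    action = int(np.argmax(Q[state]))  # act greedily with respect to the learned Q-values
    obs, reward, done, _ = env.step(action)
    env.render()

env.close()
```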
Agent Trained using Q-Learning
To train a new agent you can run the following line:
```
python train_qlearning.py
```
It will train for 50000 episodes, with the following hyperparameters:
- learning rate (alpha) = 0.1
- the temporal difference discount factor (gamma) = 0.9

After training, it will create a file called 'Q-table-Q-Learning'. Replace the 'pre-trained-Q-Learning' string passed to the load_obj() function in run_qlearning_agent.py with 'Q-table-Q-Learning' to watch the agent you trained.
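For reference, the core Q-learning update with these hyperparameters is sketched below. This is only an illustration of the algorithm, not a copy of train_qlearning.py; in particular, the epsilon value and the epsilon-greedy exploration scheme are assumptions.

```python
import numpy as np

alpha, gamma, epsilon = 0.1, 0.9, 0.1   # epsilon is an assumed value, not taken from the script

def epsilon_greedy(Q, state, n_actions):
    # Explore with probability epsilon, otherwise pick the current best action
    if np.random.rand() < epsilon:
        return np.random.randint(n_actions)
    return int(np.argmax(Q[state]))

def q_learning_update(Q, state, action, reward, next_state):
    # Off-policy target: bootstrap from the greedy (max) action in the next state
    td_target = reward + gamma * np.max(Q[next_state])
    Q[state][action] += alpha * (td_target - Q[state][action])
```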
You can run the following line to render a pre-trained agent with SARSA. It will run for 10 episodes.
```
python run_sarsa_agent.py
```
The agent was trained for 100000 episodes. Its Q-values are saved in the file 'pre-trained-SARSA.pkl', which can be loaded with the function load_obj() from the auxFunctions.py file.
Agent Trained using SARSA
To train a new agent you can run the following line:
```
python train_sarsa.py
```
It will train for 50000 episodes, with the following hyperparameters:
- learning rate (alpha) = 0.1
- the temporal difference discount factor (gamma) = 0.9

After training, it will create a file called 'Q-table-SARSA'. Replace the 'pre-trained-SARSA' string passed to the load_obj() function in run_sarsa_agent.py with 'Q-table-SARSA' to watch the agent you trained.
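The SARSA update differs from Q-learning only in the bootstrap target: it uses the action actually chosen in the next state (on-policy) instead of the greedy maximum. A minimal sketch with the same hyperparameters, again only an illustration rather than a copy of train_sarsa.py:

```python
alpha, gamma = 0.1, 0.9

def sarsa_update(Q, state, action, reward, next_state, next_action):
    # On-policy target: bootstrap from the action actually selected in the next state
    td_target = reward + gamma * Q[next_state][next_action]
    Q[state][action] += alpha * (td_target - Q[state][action])
```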
Q-learning and SARSA perform very similarly in terms of how much they learn per episode, which is expected given that both use the same hyperparameters and policy. SARSA seems to be a little more consistent: even after training for more than 50000 episodes, Q-learning still sometimes obtains episode rewards below -200.
OpenAI. (n.d.). MountainCar-v0. Retrieved from https://gym.openai.com/envs/MountainCar-v0/
Reinforcement Learning in the OpenAI Gym (Tutorial) - SARSA. (2018, August 8). [Video]. YouTube. https://www.youtube.com/watch?v=P9XezMuPfLE