Reinforcement Learning (RL) is a machine learning paradigm where an agent learns to make decisions by interacting with an environment. The agent receives feedback in the form of rewards or punishments based on the actions it takes. The goal of RL is for the agent to learn a policy, a strategy that maps states to actions, in order to maximize the cumulative reward over time.
In this project, we focus on RL in a maze environment using Q-Learning.
Q-Learning is a model-free, off-policy reinforcement learning method. The agent learns by updating Q-values, which represent the expected cumulative reward of state-action pairs. These Q-values are iteratively refined using the rewards obtained and the maximum expected reward from the subsequent state.
The Q-Learning algorithm is derived from the Bellman equation, which expresses the relationship between the value of a state and the values of its successor states.
Q(S, A) ← (1 - α) · Q(S, A) + α · [R(S, A) + γ · max_A' Q(S', A')]
This update rule iteratively refines the Q-values, where α is the learning rate, γ is the discount factor, and max_A' Q(S', A') is the largest Q-value achievable from the next state S'.
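As a concrete illustration, here is a minimal sketch of this update for a tabular Q array; the function and variable names are illustrative and not taken from the repository:

```python
import numpy as np

def q_update(Q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one Q-Learning update to a tabular Q array of shape (n_states, n_actions)."""
    best_next = np.max(Q[next_state])  # max Q(S', A') over all actions A'
    Q[state, action] = (1 - alpha) * Q[state, action] + alpha * (reward + gamma * best_next)
```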
To develop an optimal policy, the agent needs to balance exploration and exploitation. Exploration involves trying new actions to discover their rewards, while exploitation focuses on selecting actions that yield the highest known rewards. Action selection strategies, such as ε-greedy, play a crucial role in achieving this balance.
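A minimal sketch of ε-greedy action selection, assuming the same tabular Q array as above:

```python
import numpy as np

rng = np.random.default_rng()

def epsilon_greedy(Q, state, epsilon):
    """Explore with probability epsilon, otherwise exploit the best known action."""
    n_actions = Q.shape[1]
    if rng.random() < epsilon:
        return int(rng.integers(n_actions))  # explore: pick a random action
    return int(np.argmax(Q[state]))          # exploit: pick the greedy action
```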
Here is a concise summary of the environment's characteristics:
- Agent: the entity responsible for navigating the maze. Its task is to move from the starting cell to the target cell.
- Starting Cell ([0, 0]): the designated cell, at coordinates [0, 0], from which the agent begins its journey.
- Target Cell ([n, n]): the destination cell, located at coordinates [n, n].
- Randomly Generated Walls: walls are placed at random positions in each execution, creating obstacles that the agent must navigate around.
- Portal Cells: when the agent enters a portal cell, it is instantly transported to another cell of the same color, adding a dynamic aspect to maze navigation.
🎯 The primary task within this environment is to guide the agent from the starting cell to the target cell, navigating around walls and using portal cells to aid in navigation.
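The snippet below sketches what a single episode in such a maze might look like. The class name `RandomMaze` and its `reset()`/`step()` interface are hypothetical placeholders, not the repository's actual API:

```python
# Hypothetical interaction loop; RandomMaze, reset(), and step() are
# illustrative placeholders for the environment's actual interface.
from maze import RandomMaze  # hypothetical import

env = RandomMaze(n=5)        # n x n maze with random walls and portal cells
state = env.reset()          # agent starts at [0, 0]
done = False
while not done:
    action = epsilon_greedy(Q, state, epsilon)  # ε-greedy sketch from above
    state, reward, done = env.step(action)      # done once the agent reaches [n, n]
```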
- Clone the repository:

```bash
git clone https://github.com/SheidaAbedpour/Reinforcement-Learning-in-Random-Mazes.git
```

- Install the package: open a terminal inside the cloned folder and run:

```bash
python setup.py install
```

- Install dependencies:

```bash
pip install -r requirements.txt
```
- Run `main.py` and feel free to experiment with different hyperparameter values:
  - `NUM_EPISODES`: number of episodes for training
  - `epsilon`: exploration-exploitation trade-off parameter
  - `alpha`: learning rate
  - `gamma`: discount factor
- Results Visualization: results are plotted with the matplotlib library, including the number of steps and total reward over episodes, a visualization of the Q-table, and the agent's actions in the maze; a minimal plotting sketch follows below.
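For reference, here is a minimal matplotlib sketch of the episode-level plots; the two lists hold placeholder values purely for illustration, standing in for statistics collected during training:

```python
import matplotlib.pyplot as plt

# Placeholder values purely for illustration; in practice these are
# collected over NUM_EPISODES of training.
steps_per_episode = [40, 33, 27, 20, 14, 11, 10]
rewards_per_episode = [-40, -33, -27, -20, -14, -11, -10]

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.plot(steps_per_episode)
ax1.set(xlabel="Episode", ylabel="Steps to reach target")
ax2.plot(rewards_per_episode)
ax2.set(xlabel="Episode", ylabel="Total reward")
plt.tight_layout()
plt.show()
```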