
GCP-RLTB

This repository presents a hybrid approach to the Graph Coloring Problem (GCP) that combines Reinforcement Learning (RL), enhanced by Graph Neural Networks (GNNs), with Tabucol, a variant of tabu search designed specifically for the GCP.


Graph Coloring Problem (GCP)

Let $G=(V,E)$ be an undirected graph with vertex set $V$ and edge set $E$. Define a mapping $f: V \mapsto \{1, 2, \ldots, k\}$; the value $f(v)$ is the color of vertex $v$. If two adjacent vertices $x$ and $y$ receive the same color $j$, this is called a conflict. A coloring with no conflicts is called a proper coloring.

The chromatic number of $G$, denoted by $\chi(G)$, is the smallest $k$ for which a proper $k$-coloring of $G$ exists. Based on these definitions, the GCP is the problem of finding the chromatic number of a given graph $G$, i.e., the minimum number of colors required to color the vertices of $G$ such that no two adjacent vertices share the same color.
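
For concreteness, here is a minimal sketch (not taken from the repository) that counts the conflicts of a coloring and checks whether it is proper, using a triangle graph as an example. The edge-list/dict representation is an assumption for illustration only.

# Minimal illustration: graph as an edge list, coloring as a dict {vertex: color}
def count_conflicts(edges, coloring):
    # number of edges whose endpoints share the same color
    return sum(1 for u, v in edges if coloring[u] == coloring[v])

def is_proper(edges, coloring):
    return count_conflicts(edges, coloring) == 0

triangle = [(0, 1), (1, 2), (0, 2)]             # K3, chromatic number 3
print(is_proper(triangle, {0: 1, 1: 2, 2: 3}))  # True: a proper 3-coloring
print(is_proper(triangle, {0: 1, 1: 2, 2: 1}))  # False: vertices 0 and 2 conflict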



Action Space

An $[N, K]$ array of (vertex, color) pairs, where $N$ is the number of vertices and $K$ is the number of available colors.


self.action_space = spaces.Tuple((spaces.Discrete(self.N), spaces.Discrete(self.k)))
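
As an illustration (assuming a Gymnasium-style spaces API; the sizes below are made up), an action is sampled as a (vertex, color) pair:

from gymnasium import spaces  # the classic gym API behaves the same here

N, K = 6, 3  # illustrative sizes: 6 vertices, 3 colors
action_space = spaces.Tuple((spaces.Discrete(N), spaces.Discrete(K)))

vertex, color = action_space.sample()  # e.g. (4, 2): assign color 2 to vertex 4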



Reward

The default per-step reward is -1e-4. The total reward consists of two parts: an RL reward and a heuristic (Tabucol) reward.


The RL reward considers two main factors: the number of conflicts and the number of colors used.

The reward function combining these two factors is as follows.

$\text{reward}_{RL} = -\lambda \, (f(s) - f(s')) + \mu \, (C(s) - C(s'))$

Here $f(s)$ denotes the number of conflicts and $C(s)$ the number of colors used in state $s$; $s'$ is the state after the action.


# 2 factors: # of conflicts and # of colors used
del_conflicts = conflicts - new_conflicts
del_colors_used = self.n_colors_used - n_colors_used

# reward = -lambda*(f(s)-f(s')) + mu*(C(s)-C(s'))
# The color factor is excluded here (mu = 0)
_lambda = 0.01
_mu = 0
self.cur_reward = -_lambda * del_conflicts + _mu * del_colors_used

The heuristic (Tabucol) reward is the difference between the number of conflicts after $x$ steps and after $x+y$ steps:

$\text{reward}_{HO} = f(s_x) - f(s_{x+y})$

The final reward is the sum of these two terms:

$\text{reward} = \text{reward}_{RL} + \text{reward}_{HO}$
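
A hedged sketch of how the combined reward could be computed is shown below. run_tabucol and count_conflicts are hypothetical placeholders (the actual Tabucol lives in the C++ part of the repository), passed in as parameters.

# Sketch only: run_tabucol(graph, coloring, steps) and count_conflicts(graph, coloring)
# are hypothetical stand-ins for the repository's C++ Tabucol and its conflict counter.
def heuristic_reward(graph, coloring, x, y, run_tabucol, count_conflicts):
    # reward_HO = f(s_x) - f(s_{x+y})
    state_x = run_tabucol(graph, coloring, steps=x)
    state_xy = run_tabucol(graph, state_x, steps=y)
    return count_conflicts(graph, state_x) - count_conflicts(graph, state_xy)

def total_reward(reward_rl, reward_ho):
    # reward = reward_RL + reward_HO
    return reward_rl + reward_ho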



Methodology

Deep Q-Network Architecture for Graph Coloring

The Q-Network processes the combined (node, color) features through multiple layers: GNN layers for message passing and feature aggregation, followed by fully connected layers. The final output layer produces a Q-value for each (node, color) pair, representing the expected future reward of selecting that action in the current state.


Applying Graph Neural Network

The node and color features are concatenated to form (node, color) pair representations. These combined features are passed through multiple GNN layers, specifically Graph Convolutional Network (GCN) layers, inside the Q-Network. In the GCN layers, message passing aggregates and updates the feature representations based on the graph's connectivity, capturing both local and global structural information. The result is an updated feature representation that integrates the interactions between nodes and colors.
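
A minimal sketch of such a Q-Network is given below. The layer sizes, the per-node concatenation of current-color features, and the use of PyTorch Geometric's GCNConv are assumptions for illustration, not the repository's exact architecture.

import torch
import torch.nn as nn
from torch_geometric.nn import GCNConv  # assumption: GCN layers from PyTorch Geometric

class ColoringQNetwork(nn.Module):
    def __init__(self, node_dim, color_dim, hidden_dim, num_colors):
        super().__init__()
        self.conv1 = GCNConv(node_dim + color_dim, hidden_dim)
        self.conv2 = GCNConv(hidden_dim, hidden_dim)
        self.head = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_colors),  # one Q-value per color, for each node
        )

    def forward(self, node_feats, color_feats, edge_index):
        # concatenate node features with (e.g. one-hot) current-color features
        x = torch.cat([node_feats, color_feats], dim=-1)  # (N, node_dim + color_dim)
        # GCN message passing over the graph's edges
        x = torch.relu(self.conv1(x, edge_index))
        x = torch.relu(self.conv2(x, edge_index))
        return self.head(x)  # (N, K) Q-values, one per (node, color) pair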



Architecture



Results

(Update soon..)



Code

Our project is developed primarily in Python, with the Tabucol heuristic implemented in C++.

The reinforcement learning framework is built using CleanRL and extensively utilizes PyTorch.



Author

Seok-Jin Kwon / IyLias
