Based on MARL (Multi-Agent Reinforcement Learning), this project provides a dynamic coverage control algorithm for UAV swarms. The task is to plan the flight routes of the swarm so that every discrete PoI (Point of Interest) is covered for a required period of time.
Given the particularities of UAV swarm control, this project focuses on how to maintain the communication connectivity of the swarm during execution.
The problem can be formulated as follows: plan the UAV trajectories so that every PoI accumulates sufficient coverage energy in minimum time, while the communication graph of the swarm stays connected.
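The original formula images did not survive; as a sketch, one plausible formulation (the symbols $x_i$, $p_j$, $P(\cdot)$, $E_{\mathrm{th}}$, and $\mathcal{G}$ are assumed notation, not taken from the source) is:

```latex
\begin{aligned}
\min_{\{u_i(t)\}} \quad & T_{\mathrm{done}} = \min\bigl\{\, t : E_j(t) \ge E_{\mathrm{th}}\ \ \forall j \,\bigr\} \\
\text{s.t.} \quad & E_j(t) = \int_0^{t} \sum_{i=1}^{N} P\bigl(\lVert x_i(\tau) - p_j \rVert\bigr)\, d\tau, \\
& \mathcal{G}\bigl(x_1(t), \dots, x_N(t)\bigr) \text{ is connected for all } t,
\end{aligned}
```

where $x_i$ is the position of UAV $i$, $p_j$ is the position of PoI $j$, $P(\cdot)$ is a distance-dependent coverage power, $E_{\mathrm{th}}$ is the required coverage energy per PoI, and $\mathcal{G}$ is the communication graph induced by the UAV positions.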
We build a dynamic coverage environment based on Multiagent-Particle-Envs.
The class `CoverageWorld` inherits from `multiagent.core.world`; in its `step()`, the power and energy are calculated and the PoI states are updated.
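The power/energy update might look like the following sketch. The coverage model (linear power decay inside a radius `r_cover`, threshold `e_th`) and all parameter names are assumptions for illustration, not the project's actual implementation:

```python
import numpy as np

def coverage_step(uav_pos, poi_pos, poi_energy, dt=0.1,
                  r_cover=0.3, p_max=1.0, e_th=5.0):
    """Hypothetical per-step update: each PoI receives power from every UAV
    inside the coverage radius; energy accumulates until the PoI is done."""
    done = np.zeros(len(poi_pos), dtype=bool)
    for j, p in enumerate(poi_pos):
        dists = np.linalg.norm(uav_pos - p, axis=1)
        # power decays linearly with distance inside the coverage radius
        power = p_max * np.sum(np.clip(1.0 - dists / r_cover, 0.0, None))
        poi_energy[j] += power * dt
        done[j] = poi_energy[j] >= e_th
    return poi_energy, done
```

A UAV hovering directly over a PoI would contribute `p_max` per unit time, so the PoI finishes after `e_th / p_max` time units.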
`multiagent/scenarios/coverage.py` describes the dynamic coverage scenario.
`multiagent/render.py` has been modified to display, in real time, the current power received by each PoI and the communication links between the UAVs.
Other changes, such as adding connectivity-maintenance constraints and revising actions to satisfy those constraints, are described later.
Each agent's observation includes its own position and velocity, as well as the relative positions of the other agents and the PoIs. Its actions are moving forward, backward, left, or right, and staying still. As a purely cooperative scenario, all agents receive the same team reward.
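The reward formula itself was lost in export; as a hypothetical sketch of such a shared cooperative reward (the terms and the weights `w_cov`, `w_col`, `w_con` are assumptions, not the project's actual coefficients):

```python
import numpy as np

def shared_reward(energy_gain, collided, lost_connectivity,
                  w_cov=1.0, w_col=1.0, w_con=5.0):
    """Hypothetical team reward: every agent receives the same scalar,
    rewarding coverage progress and penalizing collisions and lost links."""
    r = w_cov * float(np.sum(energy_gain))   # coverage energy gained this step
    r -= w_col * int(collided)               # collision penalty
    r -= w_con * int(lost_connectivity)      # connectivity-violation penalty
    return r
```

Because every agent gets the identical scalar, there is no credit-assignment conflict between agents at the reward level; the MARL algorithm handles credit assignment internally.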
If connectivity would be lost at the next time step, a connectivity-preserving force is generated between the UAVs that are about to lose their link; this force is designed to guarantee that connectivity is maintained (proof omitted).
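A minimal sketch of this mechanism, assuming a disk communication model with radius `r_comm`: predict the next positions, test graph connectivity with a BFS, and if the swarm would disconnect, pull the closest out-of-range pair back together. The gain `k` and the closest-pair heuristic are assumptions, not the project's actual constraint force:

```python
import numpy as np
from collections import deque

def is_connected(pos, r_comm):
    """BFS over the communication graph: an edge exists iff the pair
    of UAVs is within the communication radius r_comm."""
    n = len(pos)
    adj = np.linalg.norm(pos[:, None] - pos[None, :], axis=2) <= r_comm
    seen, queue = {0}, deque([0])
    while queue:
        i = queue.popleft()
        for j in range(n):
            if adj[i, j] and j not in seen:
                seen.add(j)
                queue.append(j)
    return len(seen) == n

def connectivity_force(pos, vel, dt, r_comm, k=1.0):
    """If the swarm would disconnect at the predicted next positions,
    return attractive forces pulling the closest out-of-range pair together."""
    nxt = pos + vel * dt
    forces = np.zeros_like(pos)
    if is_connected(nxt, r_comm):
        return forces
    # find the closest pair that is out of range and pull them together
    best, pair = np.inf, None
    for i in range(len(pos)):
        for j in range(i + 1, len(pos)):
            d = np.linalg.norm(nxt[i] - nxt[j])
            if r_comm < d < best:
                best, pair = d, (i, j)
    if pair is not None:
        i, j = pair
        direction = (nxt[j] - nxt[i]) / (best + 1e-9)
        forces[i] += k * direction
        forces[j] -= k * direction
    return forces
```

In the actual environment this correction would be applied when revising the agents' actions, overriding the policy output whenever the constraint is about to be violated.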
The training results are displayed below (runs 2 and 3 are under connectivity preservation).
The MAPPO-based code is in `uav_dcc_control`.
```shell
conda create -n dcc python=3.9
pip3 install torch torchvision torchaudio omegaconf wandb
pip install gym==0.10.5
pip install pyglet==1.5.27  # optional, for rendering
python train.py 0  # "0" selects cuda:0; any int works if CUDA is unavailable
```
The positional argument `0` selects `cuda:0`; if CUDA is not available, the CPU is used instead.
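The device selection in `train.py` might be sketched as follows; the helper `pick_device` is a hypothetical name, and in practice the availability flag would come from `torch.cuda.is_available()`:

```python
import sys

def pick_device(arg, cuda_available):
    """Map the positional CLI argument to a device string:
    'cuda:<arg>' when CUDA is available, otherwise fall back to 'cpu'."""
    return f"cuda:{int(arg)}" if cuda_available else "cpu"

if __name__ == "__main__":
    # in train.py the flag would be torch.cuda.is_available()
    gpu_id = sys.argv[1] if len(sys.argv) > 1 else "0"
    print(pick_device(gpu_id, cuda_available=False))
```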