Multi-Robot Cooperative Target Search Based on Distributed Reinforcement Learning Method in 3D Dynamic Environments
Meng Zhou, Xinheng Wang, Chang Wang, Jing Wang
- Year: 2024
- Citations: 3
- Access: Open access
Abstract
This paper proposes a distributed reinforcement learning method, based on policy gradient, for multi-robot cooperative target search in 3D dynamic environments. The objective is to find all hostile drones, which are treated as targets, in minimal search time while avoiding obstacles. First, the motion models for the unmanned aerial vehicles and the obstacles in a dynamic 3D environment are presented. Then, a reward function is designed based on environmental feedback and obstacle avoidance. A loss function and its gradient are derived from the expected cumulative reward and its derivative. Next, a reinforcement learning algorithm optimizes the expected cumulative reward by updating along the gradient of the loss function. When the variance of the expected cumulative reward falls below a specified threshold, the unmanned aerial vehicle is considered to have obtained the optimal search policy. Finally, simulation results demonstrate that the proposed method enables the unmanned aerial vehicles to identify all targets in the dynamic 3D airspace while avoiding obstacles.
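The pipeline summarized in the abstract (reward design, gradient of the expected cumulative reward, variance-based stopping) can be illustrated with a minimal single-agent sketch. It is not the authors' implementation: it uses a plain REINFORCE update with a tabular softmax policy, and every name and value in it (the 3x3x3 grid, the single static obstacle, the reward magnitudes, `train`, `window`, `threshold`) is an assumption chosen for illustration. The paper's setting has multiple robots and moving obstacles, which this toy omits.

```python
import math
import random

# Toy setup (all values are illustrative assumptions, not from the paper).
SIZE = 3                                  # 3x3x3 grid of cells
MOVES = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
START, TARGET, OBSTACLE = (0, 0, 0), (2, 2, 2), (1, 1, 1)


def softmax(prefs):
    """Turn action preferences into probabilities (numerically stable)."""
    top = max(prefs)
    exps = [math.exp(p - top) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def discounted_returns(rewards, gamma):
    """Return G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    g, out = 0.0, []
    for r in reversed(rewards):
        g = r + gamma * g
        out.append(g)
    return out[::-1]


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def step(state, action):
    """Apply one move; return (next_state, reward, done)."""
    nxt = tuple(s + d for s, d in zip(state, MOVES[action]))
    if any(c < 0 or c >= SIZE for c in nxt) or nxt == OBSTACLE:
        return state, -5.0, False         # blocked: stay put, pay a penalty
    if nxt == TARGET:
        return nxt, 10.0, True            # target found
    return nxt, -1.0, False               # time cost for each search step


def run_episode(theta, rng, max_steps=30):
    """Sample one trajectory from the current softmax policy."""
    state, trajectory = START, []
    for _ in range(max_steps):
        probs = softmax(theta.setdefault(state, [0.0] * len(MOVES)))
        action = rng.choices(range(len(MOVES)), weights=probs)[0]
        nxt, reward, done = step(state, action)
        trajectory.append((state, action, reward))
        state = nxt
        if done:
            break
    return trajectory


def train(seed=0, lr=0.05, gamma=0.95, window=20, threshold=0.5,
          max_episodes=2000):
    """REINFORCE with a variance-based stopping rule on recent returns."""
    rng = random.Random(seed)
    theta, history = {}, []
    for _ in range(max_episodes):
        trajectory = run_episode(theta, rng)
        returns = discounted_returns([r for _, _, r in trajectory], gamma)
        for (state, action, _), g in zip(trajectory, returns):
            probs = softmax(theta[state])
            for k in range(len(MOVES)):
                # Gradient of log pi(action | state) w.r.t. preference k.
                grad_log = (1.0 if k == action else 0.0) - probs[k]
                theta[state][k] += lr * g * grad_log
        history.append(returns[0])
        # Stop once the last `window` episode returns barely vary.
        if len(history) >= window and variance(history[-window:]) < threshold:
            break
    return theta, history
```

Calling `theta, history = train(seed=0)` returns the learned action preferences per visited cell and the per-episode return from the start cell. The stopping rule here applies the variance criterion to a sliding window of episode returns, which is one plausible reading of the abstract; because the policy stays stochastic, the loop may instead end at `max_episodes`.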