Learning Communication for Cooperation in Dynamic Agent-Number Environment
Weiwei Liu, Shanqi Liu, Junjie Cao, Qi Wang, Xiaolei Lang, Yong Liu
- 发表年份
- 2021
- 引用次数
- 13
摘要
The number of agents in many multiagent systems in the real world, such as storage robots and drone cluster systems, continually changes. Still, most current multiagent reinforcement learning (RL) algorithms are limited to fixed network dimensions, and prior knowledge is used to preset the number of agents in the training phase, which leads to a poor generalization of the algorithm. In addition, these algorithms use centralized training to solve the instability problem of multiagent systems. However, the centralized learning of large-scale multiagent RL algorithms will lead to an explosion of network dimensions, which in turn leads to very limited scalability of centralized learning algorithms. To solve these two difficulties, in this article propose a group centralized training and decentralized execution-unlimited dynamic agent-number network (GCTDE-UDAN). First, since we use the attention mechanism to select several leaders and establish a dynamic number of teams, and the UDAN performs a nonlinear combination of all agents' Q values when performing value decomposition, it is not affected by changes in the number of agents. Moreover, our algorithm can unite any agent to form a group and conduct centralized training within the group, avoiding network dimension explosion caused by the global centralized training of large-scale agents. Finally, we verified on the simulation and experimental platform that the algorithm can learn and perform cooperative behaviors in many dynamic multiagent environments.
关键词
相关论文
Statistical Learning Theory
Yuhai Wu, Vladimir Vapnik
1999
Artificial intelligence: a modern approach
1995
Fractional Differential Equations
Igor Podlubný
2025
Applied Nonlinear Control
Jean-Jacques Slotine, Weiping Li
1991