Deep Multi-Agent Reinforcement Learning for Decentralized Continuous Cooperative Control
Christian Schröder de Witt, Bei Peng, Pierre-Alexandre Kamienny, Philip H. S. Torr, Wendelin Böhmer, Shimon Whiteson
- Year
- 2020
- Citations
- 41
Abstract
Centralised training with decentralised execution (CTDE) is an important learning paradigm in multi-agent reinforcement learning (MARL). To make progress in CTDE, we introduce Multi-Agent MuJoCo (MAMuJoCo), a novel benchmark suite that, unlike StarCraft Multi-Agent Challenge (SMAC), the predominant benchmark environment, applies to continuous robotic control tasks. To demonstrate the utility of MAMuJoCo, we present a range of benchmark results on this new suite, including comparing the state-of-the-art actor-critic method MADDPG against two novel variants of existing methods. These new methods outperform MADDPG on a number of MAMuJoCo tasks. In addition, we show that, in these continuous cooperative MAMuJoCo tasks, value factorisation plays a greater role in performance than the underlying algorithmic choices. This motivates the necessity of extending the study of value factorisations from $Q$-learning to actor-critic algorithms.
Keywords
Related papers
Statistical Learning Theory
Yuhai Wu, Vladimir Vapnik
1999
Artificial intelligence: a modern approach
1995
Applied Nonlinear Control
Jean-Jacques Slotine, Weiping Li
1991
A new optimizer using particle swarm theory
R.C. Eberhart, James Kennedy
2002