Robust Multi-Agent Reinforcement Learning by Mutual Information Regularization
Ruixiao Xu, Jingqiao Xiu, Yuwei Zheng, Pu Feng, Yuqing Ma, Bo An, Yaodong Yang, Xianglong Liu
- 发表年份
- 2025
- 引用次数
- 3
摘要
In cooperative multi-agent reinforcement learning (MARL), ensuring robustness against cooperative agents making unpredictable or worst-case adversarial actions is crucial for real-world deployment. In multi-agent settings, each agent may be perturbed or unperturbed, leading to an exponential increase in potential threat scenarios as the number of agents grows. Existing robust MARL methods either enumerate, or approximate all possible threat scenarios, leading to intense computation and insufficient robustness. In contrast, humans develop robust behaviors by maintaining a general level of caution rather than preparing for every possible threat. Inspired by human decision making, we frame robust MARL as a control-as-inference problem, and optimize worst-case robustness across all threat scenarios implicitly optimized through off-policy evaluation. Specifically, we introduce mutual information regularization as robust regularization (MIR3), which maximizes a lower bound on robustness during routine training, serving as a kind of caution for MARL without adversarial inputs. Further insights show that MIR3 acts as an information bottleneck, preventing agents from over-reacting to others and aligning policies with robust action priors. In the presence of worst-case adversaries, our MIR3 significantly surpasses baseline methods in robustness and training efficiency, and maintaining cooperative performance in StarCraft II, quadrotor swarm control, and robot swarm control. When deploying the robot swarm control algorithm in the real world, our method also outperforms the best baseline by 14.29% in reward. See code and demo videos at https://github.com/DIG-Beihang/MIR3.
关键词
相关论文
Statistical Learning Theory
Yuhai Wu, Vladimir Vapnik
1999
Artificial intelligence: a modern approach
1995
Fractional Differential Equations
Igor Podlubný
2025
Applied Nonlinear Control
Jean-Jacques Slotine, Weiping Li
1991