Learning Approximate Nash Equilibria in Cooperative Multi-Agent Reinforcement Learning via Mean-Field Subsampling
Emile Anand, Ishani Karmarkar
- 发表年份
- 2026
- 访问权限
- 开放获取
摘要
Many large-scale platforms and networked control systems have a centralized decision maker interacting with a massive population of agents under strict observability constraints. Motivated by such applications, we study a cooperative Markov game with a global agent and $n$ homogeneous local agents in a communication-constrained regime, where the global agent only observes a subset of $k$ local agent states per time step. We propose an alternating learning framework $(\texttt{ALTERNATING-MARL})$, where the global agent performs subsampled mean-field $Q$-learning against a fixed local policy, and local agents update by optimizing in an induced MDP. We prove that these approximate best-response dynamics converge to an $\widetilde{O}(1/\sqrt{k})$-approximate Nash Equilibrium, while separating the sample complexities between the joint state and action spaces. Finally, we validate our results in numerical simulations for multi-robot control.
关键词
相关论文
基于嵌入式语言模型的多机器人系统动态重构
Shokhikha Amalana Murdivien, Jongsu Park, Jumyung Um
Robotics and Computer-Integrated Manufacturing · 2026
基于大语言模型增强的多智能体强化学习的无人机博弈分层决策
Xinyu Dong, Bo Li, Guangyu Zhang 等 5 位作者
Aerospace Science and Technology · 2026
水下残骸区域多UUV协同覆盖搜索的编队优化与避碰决策方法
Haomiao Yu, Zeyuan Zhang, Yantian Ma
Robotics and Autonomous Systems · 2026
人在回路中的群体机器人:一种用于真实土壤测绘的仿生群体方法
Petras Swissler, Mohammadali Rashidioun, Nicholas Sahu 等 6 位作者
2026