SUB-PLAY: Adversarial Policies against Partially Observed Multi-Agent Reinforcement Learning Systems
Oubo Ma, Yuwen Pu, Linkang Du, Yang Dai, Ruo Wang, Xiaolei Liu, Yingcai Wu, Shouling Ji
- 发表年份
- 2024
- 访问权限
- 开放获取
摘要
Recent advancements in multi-agent reinforcement learning (MARL) have opened up vast application prospects, such as swarm control of drones, collaborative manipulation by robotic arms, and multi-target encirclement. However, potential security threats during the MARL deployment need more attention and thorough investigation. Recent research reveals that attackers can rapidly exploit the victim's vulnerabilities, generating adversarial policies that result in the failure of specific tasks. For instance, reducing the winning rate of a superhuman-level Go AI to around 20%. Existing studies predominantly focus on two-player competitive environments, assuming attackers possess complete global state observation. In this study, we unveil, for the first time, the capability of attackers to generate adversarial policies even when restricted to partial observations of the victims in multi-agent competitive environments. Specifically, we propose a novel black-box attack (SUB-PLAY) that incorporates the concept of constructing multiple subgames to mitigate the impact of partial observability and suggests sharing transitions among subpolicies to improve attackers' exploitative ability. Extensive evaluations demonstrate the effectiveness of SUB-PLAY under three typical partial observability limitations. Visualization results indicate that adversarial policies induce significantly different activations of the victims' policy networks. Furthermore, we evaluate three potential defenses aimed at exploring ways to mitigate security threats posed by adversarial policies, providing constructive recommendations for deploying MARL in competitive environments.
关键词
相关论文
基于大语言模型增强的多智能体强化学习的无人机博弈分层决策
Xinyu Dong, Bo Li, Guangyu Zhang 等 5 位作者
Aerospace Science and Technology · 2026
基于嵌入式语言模型的多机器人系统动态重构
Shokhikha Amalana Murdivien, Jongsu Park, Jumyung Um
Robotics and Computer-Integrated Manufacturing · 2026
水下残骸区域多UUV协同覆盖搜索的编队优化与避碰决策方法
Haomiao Yu, Zeyuan Zhang, Yantian Ma
Robotics and Autonomous Systems · 2026
人在回路中的群体机器人:一种用于真实土壤测绘的仿生群体方法
Petras Swissler, Mohammadali Rashidioun, Nicholas Sahu 等 6 位作者
2026