

Robust Reinforcement Learning as a Stackelberg Game via Adaptively-Regularized Adversarial Training

Peide Huang, Mengdi Xu, Fei Fang, Ding Zhao

Year of publication: 2022
Access: Open access

Abstract

Robust Reinforcement Learning (RL) focuses on improving performance under model errors or adversarial attacks, which facilitates the real-life deployment of RL agents. Robust Adversarial Reinforcement Learning (RARL) is one of the most popular frameworks for robust RL. However, most of the existing literature models RARL as a zero-sum simultaneous game with Nash equilibrium as the solution concept, which can overlook the sequential nature of RL deployments, produce overly conservative agents, and induce training instability. In this paper, we introduce a novel hierarchical formulation of robust RL, a general-sum Stackelberg game model called RRL-Stack, to formalize this sequential nature and provide extra flexibility for robust training. We develop the Stackelberg Policy Gradient algorithm to solve RRL-Stack, leveraging the Stackelberg learning dynamics by accounting for the adversary's response. Our method generates challenging yet solvable adversarial environments, which benefit the robust learning of RL agents. Our algorithm demonstrates better training stability and robustness under different testing conditions in single-agent robotics control and multi-agent highway merging tasks.
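To make the contrast between simultaneous (Nash-style) and Stackelberg learning dynamics concrete, here is a minimal sketch on a toy scalar game. It is not the paper's Stackelberg Policy Gradient algorithm or any of its environments: the two quadratic costs, the function names (`leader_cost`, `follower_cost`, `run`), the step size, and the choice to update both players at every step are all assumptions made for illustration. The leader stands in for the agent and the follower for the adversary, and both minimize their own cost. The only difference between the two runs is whether the leader's gradient includes the follower's response term, obtained from the implicit function theorem.

```python
# Toy general-sum game (hypothetical, for illustration only).
# Leader chooses x, follower chooses y; each minimizes its own cost.

def leader_cost(x, y):
    return 0.5 * (x - 1.0) ** 2 + 0.5 * y ** 2


def follower_cost(x, y):
    # Follower's best response is y*(x) = x, since d/dy = y - x.
    return 0.5 * y ** 2 - x * y


def run(stackelberg, steps=2000, lr=0.05):
    """Gradient descent for both players from (0, 0).

    stackelberg=False: the leader uses only its partial gradient df/dx.
    stackelberg=True:  the leader uses the total derivative
                       df/dx + (dy*/dx) * df/dy.
    """
    x, y = 0.0, 0.0
    for _ in range(steps):
        df_dx = x - 1.0   # partial of leader cost w.r.t. x
        df_dy = y         # partial of leader cost w.r.t. y
        dg_dy = y - x     # partial of follower cost w.r.t. y
        if stackelberg:
            # Implicit function theorem on dg/dy = 0:
            # dy*/dx = -(d2g/dy2)^-1 * d2g/dydx = -(1)^-1 * (-1) = 1
            dy_dx = 1.0
            leader_grad = df_dx + dy_dx * df_dy
        else:
            leader_grad = df_dx
        # Both players step at once (a simplification of this sketch).
        x, y = x - lr * leader_grad, y - lr * dg_dy
    return x, y


x_sim, y_sim = run(stackelberg=False)
x_stk, y_stk = run(stackelberg=True)
print("simultaneous:", x_sim, y_sim, leader_cost(x_sim, y_sim))
print("stackelberg: ", x_stk, y_stk, leader_cost(x_stk, y_stk))
```

In this toy game the simultaneous dynamics settle near (1, 1) with leader cost 0.5, while the Stackelberg dynamics settle near (0.5, 0.5) with leader cost 0.25, because the leader anticipates that moving x also moves the follower's best response.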

Keywords

cs.LG, cs.AI, cs.GT

