Safe Reinforcement Learning Using Wasserstein Distributionally Robust MPC and Chance Constraint

Arash Bahari Kordabad, Rafał Wiśniewski, Sébastien Gros

Publication year: 2022
Citations: 14
Access: Open access

Abstract

In this paper, we address the chance-constrained safe Reinforcement Learning (RL) problem using function approximators based on Stochastic Model Predictive Control (SMPC) and Distributionally Robust Model Predictive Control (DRMPC). We use Conditional Value at Risk (CVaR) to measure the probability of constraint violation and safety. To provide a policy that is safe by construction, we first propose using a parameterized nonlinear DRMPC at each time step. DRMPC optimizes a finite-horizon cost function subject to the worst-case constraint violation over an ambiguity set. As the ambiguity set, we use a statistical ball around the empirical distribution, with its radius measured by the Wasserstein metric. Unlike sample average approximation SMPC, DRMPC provides a probabilistic guarantee on the out-of-sample risk and requires fewer samples of the disturbance. Q-learning is then used to optimize the DRMPC parameters to achieve the best closed-loop performance. Wheeled Mobile Robot (WMR) path planning with obstacle avoidance is considered to illustrate the efficiency of the proposed method.
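
As a minimal illustration of the risk measure named in the abstract, the sketch below estimates CVaR from samples of a constraint function using the standard Rockafellar-Uryasev formula, CVaR_α(Z) = min_t { t + E[(Z − t)⁺] / (1 − α) }. It is not the paper's formulation: the function name `empirical_cvar`, the sign convention (a value ≤ 0 means the constraint holds), and the sample values are assumptions made for illustration. Because CVaR upper-bounds the α-quantile, requiring CVaR_α ≤ 0 is a conservative way to enforce a chance constraint of the form P(constraint holds) ≥ α.

```python
def empirical_cvar(samples, alpha):
    """CVaR at level alpha of the empirical distribution of `samples`.

    Uses the Rockafellar-Uryasev form
        CVaR_alpha(Z) = min_t  t + E[max(Z - t, 0)] / (1 - alpha).
    For an empirical distribution the objective is convex and piecewise
    linear with breakpoints at the samples, so a minimizer can always be
    found among the sample values themselves.
    """
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")

    n = len(samples)

    def objective(t):
        # t plus the scaled average excess of the samples over t
        excess = sum(max(z - t, 0.0) for z in samples)
        return t + excess / ((1.0 - alpha) * n)

    # O(n^2) search over candidate thresholds; fine for a small sketch.
    return min(objective(t) for t in samples)


# Hypothetical constraint values h(x, w_i) for four disturbance samples;
# assumed convention: h <= 0 means the constraint is satisfied.
constraint_values = [-1.0, -0.5, -0.2, 0.3]
risk = empirical_cvar(constraint_values, alpha=0.5)
is_safe = risk <= 0.0  # conservative surrogate for P(h <= 0) >= alpha
```

This sketch evaluates CVaR only under the empirical distribution. The distributionally robust variant described in the abstract would instead take the worst case of this quantity over all distributions within a Wasserstein ball around the empirical one, which is not attempted here.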

Keywords

Computer science, Wasserstein metric, Mathematical optimization, Reinforcement learning, Probabilistic logic, Model predictive control, Parameterized complexity, Bellman equation, Motion planning, Ambiguity
