Physics-model-guided Worst-case Sampling for Safe Reinforcement Learning
Hongpeng Cao, Yanbing Mao, Lui Sha, Marco Caccamo
- Year
- 2024
- Access
- Open access
Abstract
Real-world accidents in learning-enabled CPS frequently occur in challenging corner cases. During the training of deep reinforcement learning (DRL) policy, the standard setup for training conditions is either fixed at a single initial condition or uniformly sampled from the admissible state space. This setup often overlooks the challenging but safety-critical corner cases. To bridge this gap, this paper proposes a physics-model-guided worst-case sampling strategy for training safe policies that can handle safety-critical cases toward guaranteed safety. Furthermore, we integrate the proposed worst-case sampling strategy into the physics-regulated deep reinforcement learning (Phy-DRL) framework to build a more data-efficient and safe learning algorithm for safety-critical CPS. We validate the proposed training strategy with Phy-DRL through extensive experiments on a simulated cart-pole system, a 2D quadrotor, a simulated and a real quadruped robot, showing remarkably improved sampling efficiency to learn more robust safe policies.
Keywords
Related papers
Trajectory tracking control for 6WID/4WIS UGV via nonlinear sliding mode-model predictive control with adaptive following steering and dynamic-static constraints
Shengyang Lu, Guanpeng Chen, Lijing Zhao +2 more
Robotics and Autonomous Systems · 2026
Bioinspired underwater robotics: Advances across the materials, design, control, and applications
Dilip Muchhala, Pramod Kumar Maurya, Adarsh Raut +3 more
Robotics and Autonomous Systems · 2026
Modeling and control of a rigid–soft hybrid-link humanoid robot
Zewen He, Taiki Ishigaki, Ko Yamamoto
Robotics and Autonomous Systems · 2026
Artificial pushing adaptive coordinated control for the human-exoskeleton-walker system
Xinhao Zhang, Chen Yang, Chaobin Zou +4 more
Robotics and Autonomous Systems · 2026