Shield-Loco: Shielding Locomotion Policies with Predictive Safety Filtering
Aditya Shirwatkar, Sebastian Sanokowski, Shishir Kolathaya, Aaron Johnson, Majid Khadiv
- Publication year: 2026
- Access: Open access
Abstract
Reinforcement learning (RL) policies enable dynamic legged locomotion but lack mechanisms to avoid violations of safety constraints that are absent during training. Large-scale offline safe learning is impractical for covering all edge cases. Existing safety frameworks either rely on reduced-order models that cannot reason about whole-body behaviors or require conservative recovery controllers that degrade task performance. We propose a predictive safety filter that post-hoc filters the nominal contact locations fed to the RL policy. When a collision is predicted, a sampling-based optimizer asynchronously searches for safer contact sequences using a full-physics model, while a learned value function bootstraps long-horizon returns. Our three algorithmic components (geometric projection of sampled contacts, momentum-augmented updates, and replica-exchange) make the optimization tractable in a discontinuous contact landscape. We validate the filter on a quadruped robot in dense, cluttered environments, both in simulation and in the real world, showing substantial reductions in safety violations with minimal deviation from the nominal input.
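The filtering idea described above can be illustrated with a toy sketch. This is not the authors' implementation: it assumes 2D contact locations, circular obstacles, and a cost equal to the deviation from the nominal contact, whereas the abstract describes full-physics rollouts bootstrapped by a learned value function. The names `project_out_of_obstacles`, `in_collision`, and `filter_contact`, and all parameters, are hypothetical. Replica exchange and asynchronous execution are omitted for brevity.

```python
import math
import random


def in_collision(p, obstacles):
    """True if 2D point p lies strictly inside any circular obstacle (cx, cy, r)."""
    return any(math.hypot(p[0] - cx, p[1] - cy) < r for cx, cy, r in obstacles)


def project_out_of_obstacles(p, obstacles, margin=0.02):
    """Geometric projection (toy version): push p radially out of each obstacle.

    Single pass only; with overlapping obstacles the result may still collide,
    so callers should re-check with in_collision.
    """
    x, y = p
    for cx, cy, r in obstacles:
        dx, dy = x - cx, y - cy
        d = math.hypot(dx, dy)
        if d < r + margin:
            if d < 1e-9:  # point at the center: pick an arbitrary direction
                dx, dy, d = 1.0, 0.0, 1.0
            scale = (r + margin) / d
            x, y = cx + dx * scale, cy + dy * scale
    return (x, y)


def filter_contact(nominal, obstacles, iters=20, samples=32,
                   sigma=0.05, beta=0.5, seed=0):
    """Return the nominal contact if it is safe; otherwise search nearby.

    Assumed cost: distance to the nominal contact plus a large collision
    penalty. The sampling mean is updated with a simple momentum term.
    """
    if not in_collision(nominal, obstacles):
        return nominal  # filter is inactive when no collision is predicted

    rng = random.Random(seed)
    mean = nominal
    velocity = (0.0, 0.0)
    best, best_cost = None, float("inf")

    for _ in range(iters):
        candidates = []
        for _ in range(samples):
            s = (rng.gauss(mean[0], sigma), rng.gauss(mean[1], sigma))
            s = project_out_of_obstacles(s, obstacles)
            cost = math.hypot(s[0] - nominal[0], s[1] - nominal[1])
            if in_collision(s, obstacles):
                cost += 1e3  # penalize samples the projection could not fix
            candidates.append((cost, s))

        cost, s = min(candidates)
        if cost < best_cost:
            best, best_cost = s, cost

        # Momentum-augmented update of the sampling mean toward this
        # iteration's best sample.
        step = (s[0] - mean[0], s[1] - mean[1])
        velocity = (beta * velocity[0] + (1 - beta) * step[0],
                    beta * velocity[1] + (1 - beta) * step[1])
        mean = (mean[0] + velocity[0], mean[1] + velocity[1])

    return best


if __name__ == "__main__":
    obstacles = [(0.0, 0.0, 0.1)]  # one circular obstacle at the origin
    print(filter_contact((0.03, 0.0), obstacles))  # unsafe nominal: gets moved
    print(filter_contact((0.5, 0.5), obstacles))   # safe nominal: unchanged
```

In this sketch a safe nominal contact passes through unchanged, which mirrors the stated goal of minimal deviation from the nominal input; the search only runs when a collision is detected.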