LOCOMOTION

Residual Policy Optimization With Trust Region Constraints: A Learning Framework for Stable and Agile Wheel-Legged Locomotion

Naifeng He, Xiaoliang Fan, W. Que, Siyang Liu, Hongyu Xu, Chunguang Bu, Bi Zhang

Publication year
2025
Citations
1

Abstract

Wheel-legged robots integrate the adaptability of legged locomotion with the efficiency of wheeled movement, enabling agile traversal across diverse terrains. However, abrupt terrain transitions introduce substantial state variations, including velocity fluctuations, posture shifts, and slippage, which pose significant challenges to locomotion stability. To address these issues, we propose a state error compensation framework that integrates a residual network with a trust-region mechanism. The residual network implicitly captures nonlinear contact dynamics, enabling real-time correction of slippage-induced state deviations, while the trust-region mechanism regulates compensation amplitude to maintain stable locomotion. Furthermore, we introduce a dual-source contrastive learning strategy, which explicitly differentiates terrain-induced transitions from external perturbations, facilitating context-aware error recovery. The proposed framework is integrated into a model-free reinforcement learning pipeline, ensuring adaptability to previously unseen environments. To further enhance robustness, an uncertainty-aware calibration module is introduced. This module dynamically adjusts the trust region boundary in real time, leveraging sensory feedback to adaptively constrain residual corrections and prevent over-adjustment, thereby maintaining stability during diverse terrain transitions. Experimental results demonstrate that the proposed framework achieves a 96.7% terrain traversal success rate and 92% velocity tracking accuracy under dynamic disturbances. On unstructured and mixed terrains, it maintains a mean velocity tracking error of 0.15 m/s and stable posture, with pitch and roll angles constrained to ±0.04 rad and ±0.02 rad, respectively.
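As a rough illustration of the idea of bounding a residual correction with an adjustable trust region, the following is a minimal sketch, not the authors' implementation. The function names, the L2-ball projection, and the rule that shrinks the radius as uncertainty grows are all assumptions made for this example; the abstract does not specify any of them.

```python
import numpy as np

def trust_region_radius(base_radius, uncertainty, sensitivity=1.0):
    # Hypothetical calibration rule: the allowed correction shrinks
    # as the uncertainty estimate grows.
    return base_radius / (1.0 + sensitivity * uncertainty)

def compensate(nominal_action, residual, radius):
    # Assumed constraint: scale the residual back onto an L2 ball of the
    # given radius before adding it to the nominal policy output.
    norm = np.linalg.norm(residual)
    if norm > radius and norm > 0.0:
        residual = residual * (radius / norm)
    return nominal_action + residual

# Illustrative values only (not taken from the paper).
nominal = np.array([0.5, -0.2])
residual = np.array([0.3, 0.4])
radius = trust_region_radius(base_radius=0.2, uncertainty=1.0)
action = compensate(nominal, residual, radius)
```

With these illustrative numbers the residual is scaled down so that the final action never deviates from the nominal one by more than the current radius. A learned residual network and a sensor-driven uncertainty estimate would replace the hard-coded inputs.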

Keywords

Terrain, Residual, Adaptability, Reinforcement learning, Tree traversal, Robot, Agile software development, Compensation (psychology), Tracking error
