LOCOMOTION

Bridging the Gap: Soft Actor-Critic for High-Performance Legged Locomotion

Gianluca Sabatini, Chenhao Li, Marco Hutter

Year of publication: 2026
Access: Open access

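Abstract

This paper identifies the root causes of why Soft Actor-Critic (SAC) underperforms Proximal Policy Optimization (PPO) in massively parallel training. It proposes targeted improvements, including policy initialization, timeout-aware critic targets, and multi-step return estimation, that allow SAC to fully close the performance gap with PPO across a variety of legged robot platforms.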
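The abstract names timeout-aware critic targets and multi-step returns without giving formulas. The sketch below is not the authors' implementation. It only illustrates one common convention these terms usually refer to: an n-step target that stops bootstrapping at a true terminal state but keeps bootstrapping when the episode ended only because of a time limit. The function name, argument names, and the way the soft value estimate is passed in are all assumptions made for illustration.

```python
def n_step_target(rewards, terminated, truncated, bootstrap_values, gamma=0.99):
    """Hypothetical n-step critic target for the first transition of a rollout window.

    rewards[k]          reward received at step k
    terminated[k]       True if step k ended the episode for a real reason (e.g. a fall)
    truncated[k]        True if step k ended the episode only because of the time limit
    bootstrap_values[k] critic's value estimate of the state reached after step k,
                        evaluated before any environment reset (for SAC this would
                        be the entropy-regularized soft value; assumed precomputed)
    """
    target, discount = 0.0, 1.0
    for k in range(len(rewards)):
        target += discount * rewards[k]
        discount *= gamma
        is_last = k == len(rewards) - 1
        if terminated[k]:
            # True terminal state: there is no future return, so do not bootstrap.
            return target
        if truncated[k] or is_last:
            # Timeout or end of the n-step window: the episode could have
            # continued, so add the discounted value of the state reached.
            return target + discount * bootstrap_values[k]
    return target
```

With `gamma=0.5`, rewards `[1, 1, 1]`, and bootstrap values of `10`, a timeout at the second step still adds the discounted value of the state reached, whereas a true termination at the same step drops it. Treating the two cases identically is what a timeout-aware target is meant to avoid.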

Keywords

Soft Actor-Critic, legged locomotion, sim-to-real, sample efficiency, reinforcement learning
