MANIPULATION

RLDG: Robotic Generalist Policy Distillation via Reinforcement Learning

Charles Xu, Qiyang Li, Jianlan Luo, Sergey Levine

Publication year
2025
Citations
5
Access
Open access

Abstract

Fig. 1 (panels: Generalization to Unseen Scenarios; Composition for Long Horizon Tasks): RLDG improves generalist robot policies such as OpenVLA and Octo by training specialist RL policies and using them to generate high-quality fine-tuning datasets. It can distill knowledge from multiple RL policies, each trained on an individual, narrowly scoped task, into a single generalist. It can also be applied to only the most critical sub-task of a long-horizon manipulation task, improving the success rate at the "bottleneck" while relying on human demonstrations for the parts of the task where they suffice. The resulting fine-tuned generalist policies are capable of precise manipulation, generalization to unseen scenarios, and composition of skills to solve long-horizon tasks.
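The pipeline described above (train specialists, roll them out to build a dataset, fine-tune one generalist on the pooled data) can be illustrated with a deliberately tiny sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the 1-D reaching task, the fixed proportional controllers standing in for RL-trained specialists, and the one-parameter goal-conditioned policy standing in for a generalist such as OpenVLA or Octo. All function names are hypothetical.

```python
import random

def make_specialist(goal, gain):
    """Stand-in for a task-specific RL policy (assumption: a fixed proportional controller)."""
    def policy(state):
        return gain * (goal - state)
    return policy

def rollout(policy, goal, start, steps=20, tol=0.05):
    """Run a policy on a toy 1-D reaching task; return (transitions, success)."""
    state, transitions = start, []
    for _ in range(steps):
        action = policy(state)
        transitions.append((goal, state, action))
        state += action
    return transitions, abs(state - goal) < tol

def collect_dataset(specialists, episodes_per_task, rng):
    """Pool transitions from successful specialist rollouts across all tasks."""
    dataset = []
    for goal, policy in specialists.items():
        for _ in range(episodes_per_task):
            transitions, success = rollout(policy, goal, rng.uniform(-1.0, 1.0))
            if success:  # keep only successful episodes as fine-tuning data
                dataset.extend(transitions)
    return dataset

def distill(dataset):
    """Fit a goal-conditioned policy a = w * (goal - state) by 1-D least squares."""
    num = sum((g - s) * a for g, s, a in dataset)
    den = sum((g - s) ** 2 for g, s, a in dataset)
    w = num / den
    return lambda state, goal: w * (goal - state)

rng = random.Random(0)
# Two narrowly scoped tasks, each with its own specialist (goal -> policy).
specialists = {0.5: make_specialist(0.5, 0.4), -0.5: make_specialist(-0.5, 0.6)}
dataset = collect_dataset(specialists, episodes_per_task=10, rng=rng)
generalist = distill(dataset)

# Evaluate the single distilled policy on a goal neither specialist was built for.
_, solved_unseen = rollout(lambda s: generalist(s, 0.2), goal=0.2, start=-1.0)
```

In this toy setting the distilled policy ends up with a gain between those of the two specialists and can be queried with any goal, which loosely mirrors the idea of merging several narrow policies into one. It says nothing about how well the real method performs.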

Keywords

Reinforcement learning, Distillation, Action (physics), Control (management), Process (computing), Supervisory control
