SDS -- See it, Do it, Sorted: Quadruped Skill Synthesis from Single Video Demonstration
Μαρία Σταματοπούλου, Jeffrey Li, Dimitrios Kanoulas
- 发表年份
- 2024
- 引用次数
- 2
- 访问权限
- 开放获取
摘要
Imagine a robot learning locomotion skills from any single video, without labels or reward engineering. We introduce SDS ("See it. Do it. Sorted."), an automated pipeline for skill acquisition from unstructured demonstrations. Using GPT-4o, SDS applies novel prompting techniques, in the form of spatio-temporal grid-based visual encoding ($G_{v}$) and structured input decomposition (SUS). These produce executable reward functions (RF) from the raw input videos. The RFs are used to train PPO policies and are optimized through closed-loop evolution, using training footage and performance metrics as self-supervised signals. SDS allows quadrupeds (e.g. Unitree Go1) to learn four gaits -- trot, bound, pace, and hop -- achieving 100% gait matching fidelity, Dynamic Time Warping (DTW) distance in the order of $10^{-6}$, and stable locomotion with zero failures, both in simulation and the real world. SDS generalizes to morphologically different quadrupeds (e.g. ANYmal) and outperforms prior work in data efficiency, training time and engineering effort. Further materials and the code are open-source under: https://rpl-cs-ucl.github.io/SDSweb/.
关键词
相关论文
Statistical Learning Theory
Yuhai Wu, Vladimir Vapnik
1999
Artificial intelligence: a modern approach
1995
Applied Nonlinear Control
Jean-Jacques Slotine, Weiping Li
1991
A new optimizer using particle swarm theory
R.C. Eberhart, James Kennedy
2002