Learning aggressive animal locomotion skills for quadrupedal robots solely from monocular videos
Zhao Liu, Zeren Luo, Yimin Han, Jiahui Zhang, Yuanhao Chen, Yunhui Liu, Peng Lu
- Year: 2025
- Citations: 3
- Access: Open access
Abstract
The quest for agile quadrupedal robots is limited by handcrafted reward design in reinforcement learning. While animal motion capture provides 3D references, its cost prohibits scaling. Video learning provides an efficient alternative yet suffers from 2D limitations and joint tracking failures during explosive motions. We address this with a novel video-based framework. First, robust 2D pose estimation constructs a skeleton graph model, enabling Kalman-filter-based joint position fusion. Next, a spatial-temporal graph convolutional network aggregates spatial pose features via graph convolutions and temporal dynamics through dilated convolutions, recovering 3D joint trajectories. These trajectories are mapped to the robot’s joint space to formulate generative imitation learning. Real-robot deployment demonstrates successful learning of complex motions: gallop (high-speed), tripod (fault-tolerant), bipedal (challenging for a quadruped), and backflip. The proposed framework significantly advances robotic locomotion capabilities.
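The Kalman-filter-based joint position fusion mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the state model, the noise parameters, or the measurement sources, so everything below is an assumption: a constant-velocity model per image coordinate, hypothetical noise values, and a hypothetical `fuse_track` helper that fuses any number of per-frame measurements (for example a keypoint detection and a skeleton-derived estimate). Frames with no measurement are bridged by prediction alone, which is one plausible way to ride out tracking failures during explosive motions.

```python
from dataclasses import dataclass


@dataclass
class AxisKalman:
    """Constant-velocity Kalman filter for one image coordinate of one joint.

    Assumed model (not taken from the paper): state = (position, velocity),
    measurement = position only. The 2x2 covariance is stored element-wise.
    """
    pos: float
    vel: float = 0.0
    p00: float = 100.0  # position variance (hypothetical initial value)
    p01: float = 0.0
    p10: float = 0.0
    p11: float = 100.0  # velocity variance (hypothetical initial value)

    def predict(self, dt, q_pos=1.0, q_vel=10.0):
        # State propagation: x <- F x with F = [[1, dt], [0, 1]].
        self.pos += self.vel * dt
        # Covariance propagation: P <- F P F^T + diag(q_pos, q_vel).
        p00 = self.p00 + dt * (self.p01 + self.p10) + dt * dt * self.p11 + q_pos
        p01 = self.p01 + dt * self.p11
        p10 = self.p10 + dt * self.p11
        p11 = self.p11 + q_vel
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11

    def update(self, z, r):
        # Measurement model H = [1, 0]; r is the measurement variance.
        s = self.p00 + r                        # innovation variance
        k0, k1 = self.p00 / s, self.p10 / s     # Kalman gain
        y = z - self.pos                        # innovation
        self.pos += k0 * y
        self.vel += k1 * y
        # Covariance update: P <- (I - K H) P, computed from the old values.
        p00 = (1 - k0) * self.p00
        p01 = (1 - k0) * self.p01
        p10 = self.p10 - k1 * self.p00
        p11 = self.p11 - k1 * self.p01
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11


def fuse_track(frames, dt=1.0):
    """Filter one coordinate of one joint over a video.

    `frames` is a list with one entry per video frame; each entry is a list
    of (measurement, variance) pairs. An empty list means the joint was not
    observed in that frame, so only the prediction step runs.
    """
    first = next(z for obs in frames for (z, _) in obs)  # first available value
    kf = AxisKalman(pos=first)
    fused = []
    for obs in frames:
        kf.predict(dt)
        for z, r in obs:          # sequential updates fuse all sources
            kf.update(z, r)
        fused.append(kf.pos)
    return fused


if __name__ == "__main__":
    # Hypothetical x-coordinates (pixels); frame 2 has no detection at all.
    frames = [[(0.0, 4.0)], [(2.0, 4.0)], [], [(6.0, 4.0), (6.2, 9.0)]]
    print(fuse_track(frames))
```

A lower variance `r` pulls the estimate more strongly toward that measurement, so a detector confidence score could be mapped to `r` if one is available. A full pipeline would run one such filter per joint and per axis, or a joint-wise vector filter, before the 2D-to-3D lifting stage.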