LOCOMOTION

Learning aggressive animal locomotion skills for quadrupedal robots solely from monocular videos

Zhao Liu, Zeren Luo, Yimin Han, Jiahui Zhang, Yuanhao Chen, Yunhui Liu, Peng Lu

Year
2025
Citations
3
Access
Open access

Abstract

Progress toward agile quadrupedal robots is limited by the handcrafted reward design that reinforcement learning requires. Animal motion capture provides 3D references, but its cost prohibits scaling. Learning from video is an efficient alternative, yet it suffers from the limits of 2D data and from joint tracking failures during explosive motions. We address this with a novel video-based framework. First, robust 2D pose estimation constructs a skeleton graph model, enabling Kalman-filter-based fusion of joint positions. Next, a spatial-temporal graph convolutional network aggregates spatial pose features via graph convolutions and temporal dynamics via dilated convolutions, recovering 3D joint trajectories. These trajectories are mapped to the robot’s joint space and used to formulate generative imitation learning. Real-robot deployment demonstrates successful learning of complex motions: gallop (high-speed), tripod (fault-tolerant), bipedal (challenging for a quadruped), and backflip. The proposed framework extends the range of aggressive locomotion skills a quadrupedal robot can learn from monocular video alone.
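
As an illustration of the Kalman-filter step mentioned in the abstract, the following is a minimal sketch of filtering one joint's 2D track so that frames with a failed detection are bridged by the motion model. It is not the authors' implementation: the constant-velocity state model, the noise variances, the NaN convention for missing detections, and the function name `kalman_track_joint` are all assumptions made for this example, and the paper's fusion over the skeleton graph is not reproduced here.

```python
import numpy as np

def kalman_track_joint(observations, dt=1.0, process_var=1e-2, meas_var=1.0):
    """Filter a (T, 2) track of one joint's pixel positions.

    Rows containing NaN mark frames where the 2D detector lost the joint
    (an assumed convention). At least one valid detection is required.
    """
    observations = np.asarray(observations, dtype=float)

    # Assumed constant-velocity model with state [x, y, vx, vy].
    F = np.array([[1, 0, dt, 0],
                  [0, 1, 0, dt],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1]], dtype=float)
    H = np.array([[1, 0, 0, 0],
                  [0, 1, 0, 0]], dtype=float)  # only position is observed
    Q = process_var * np.eye(4)                # process noise (assumed)
    R = meas_var * np.eye(2)                   # measurement noise (assumed)

    # Initialise at the first valid detection with zero velocity.
    valid = ~np.isnan(observations).any(axis=1)
    first = observations[valid][0]
    x = np.array([first[0], first[1], 0.0, 0.0])
    P = 10.0 * np.eye(4)

    filtered = np.empty_like(observations)
    for t, z in enumerate(observations):
        # Predict the state forward by one frame.
        x = F @ x
        P = F @ P @ F.T + Q
        # Correct only when the detector produced a measurement.
        if not np.isnan(z).any():
            innovation = z - H @ x
            S = H @ P @ H.T + R
            K = P @ H.T @ np.linalg.inv(S)
            x = x + K @ innovation
            P = (np.eye(4) - K @ H) @ P
        filtered[t] = x[:2]
    return filtered

# Usage: a joint moving diagonally, with the detection lost in frame 3.
track = np.array([[0, 0], [1, 1], [2, 2],
                  [np.nan, np.nan], [4, 4], [5, 5]], dtype=float)
smoothed = kalman_track_joint(track)
```

During the lost frame the filter outputs its prediction, so the track stays continuous instead of dropping out. In a full pipeline each joint would need such an estimate before the 2D poses are lifted to 3D.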

Keywords

Robot, Graph, Monocular, Humanoid robot, Exploit, Quadrupedalism, Motion capture, Reinforcement learning
