Dense symmetric temporal alignment learning for human pose estimation

Guang Xv, Xingchen Wu

Year: 2025
Citations: 1
Access: Open access

Abstract

Human pose estimation aims to locate human joint positions in images or videos. The problem has drawn increasing attention and has wide applications in autonomous driving, motion analysis, and intelligent robotics. Some existing works aggregate motion features from neighboring frames, which helps capture sufficient information. However, given the fast motion and pose occlusion in videos, directly incorporating unaligned visual cues from adjacent frames is prone to introducing noise because of the significant differences in inter-frame characteristics. In this article, we advocate performing adequate feature alignment between the keyframe and supporting frames to better utilize neighboring-frame context. To this end, we propose a novel symmetric U-Net-like feature alignment algorithm for human pose estimation. The algorithm learns symmetric information at the global and local levels separately for each scale to help the model generate accurate results. Specifically, a global alignment block based on temporal deformable convolution is designed to learn the complex temporal dynamics between adjacent and current frames and align their features. Moreover, a local alignment block based on adaptive convolution is presented to further refine the feature information and preserve geometric structure. Coupling these two modules into a U-Net-like symmetric architecture forms our framework. We demonstrate the effectiveness of our algorithm on two large pose estimation benchmark datasets, PoseTrack2017 and PoseTrack2018. In addition, we show that the proposed model achieves state-of-the-art performance on a self-built badminton dataset.
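
As a rough illustration of the idea behind offset-based alignment (not the authors' implementation), the sketch below warps a supporting-frame feature map toward the keyframe by sampling it at offset positions with bilinear interpolation, which is the sampling step that deformable convolution builds on. The function names `bilinear_sample` and `align_with_offsets` are hypothetical, the maps are single-channel, and the offsets are set by hand; in a temporal deformable convolution they would be predicted by the network, with one offset per kernel tap and learned kernel weights on top.

```python
import math


def bilinear_sample(feat, y, x):
    """Sample a 2-D feature map (list of rows) at a fractional (y, x).

    Positions outside the map contribute zero (zero padding).
    """
    h, w = len(feat), len(feat[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    # Blend the four integer neighbours around (y, x).
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < h and 0 <= xx < w:
                weight = (1 - abs(y - yy)) * (1 - abs(x - xx))
                value += weight * feat[yy][xx]
    return value


def align_with_offsets(support, offsets):
    """Warp `support` so that out[i][j] = support sampled at (i + dy, j + dx).

    `offsets[i][j]` is a (dy, dx) pair; here it is hand-set, whereas a real
    model would predict it from the keyframe and supporting-frame features.
    """
    h, w = len(support), len(support[0])
    return [
        [
            bilinear_sample(support, i + offsets[i][j][0], j + offsets[i][j][1])
            for j in range(w)
        ]
        for i in range(h)
    ]


# Toy example: a vertical "limb" sits in column 1 of the keyframe features
# and has moved one pixel to the right in the supporting frame.
key = [[0, 1, 0, 0], [0, 2, 0, 0], [0, 3, 0, 0]]
support = [[0, 0, 1, 0], [0, 0, 2, 0], [0, 0, 3, 0]]

# Assumed offsets: look one pixel to the right everywhere.
offsets = [[(0.0, 1.0)] * 4 for _ in range(3)]
aligned = align_with_offsets(support, offsets)
```

With these hand-picked offsets the warped supporting features line up with the keyframe features, so they can be aggregated without the misalignment noise the abstract describes. Fractional offsets are handled by the bilinear weights, which is what makes the sampling differentiable with respect to the offsets in a learned setting.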

Keywords

Pose; Block (permutation group theory); Articulated body pose estimation; Feature (linguistics); Convolution (computer science); Benchmark (surveying); Motion estimation; Frame (networking); Motion (physics)
