Efficient deep learning methods for human pose estimation
Umer Rafi
- Publication year
- 2018
- Citations
- 2
- Access
- Open access
Abstract
Human pose estimation is a very active research area in computer vision. The goal is to infer the articulated body pose of people in images and videos. It has applications in gaming, human-computer interaction, analysis, image retrieval, robotics and autonomous driving. In the past few years, impressive success has been achieved on human pose estimation in uncontrolled environments, and this success can be attributed mainly to deep learning. However, the trend is to push for maximum performance with computationally expensive deep learning methods. Despite their good performance, these methods need high-end graphics processing units to run efficiently. The recent increase in the deployment of service robots and autonomous vehicles has raised the demand for human pose estimation methods that run efficiently and robustly on the low-end graphics processing units typically available on such platforms. In this thesis, we propose efficient and robust deep learning methods for human pose estimation. We start with 3D human pose estimation from depth data. We build on a regression forest method to handle partial occlusion by objects without any increase in the method's complexity. The proposed method can be used in indoor scenarios to estimate the body poses of people partially occluded by objects. We then move to 2D single-person pose estimation from RGB images. We propose an efficient and robust deep learning method that achieves performance comparable to recent state-of-the-art 2D single-person pose estimation methods while running efficiently on low-end graphics processing units. We then approach the task of multi-person pose estimation. We propose a bottom-up multi-person method with a novel association scheme based on pairwise relative offsets. Our approach simultaneously predicts body parts and relative offset maps. The relative offsets are used to associate body joints with a greedy procedure. Compared to state-of-the-art bottom-up approaches, the proposed method does not require any expensive post-processing. Finally, we propose a correspondence matching method with an efficient correlation layer. The correlation layer efficiently computes similarity heat maps for all the query key points over the target image. The method is applied to the tasks of semantic key point matching, dense matching and instance key point matching. In the context of multi-person pose tracking, we apply instance key point matching to associate body joints over time.
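The greedy association step mentioned in the abstract can be illustrated with a small sketch. This is not the thesis's implementation, only a minimal example under stated assumptions: detections for two joint types are already available as (x, y) points, each parent joint comes with one predicted relative offset to its child joint, and a pair is accepted when the child lies within a distance threshold of the predicted location. The function name `greedy_associate`, the `max_dist` parameter and the sample coordinates are all hypothetical.

```python
from math import hypot


def greedy_associate(parents, children, offsets, max_dist=20.0):
    """Greedily link parent joints to child joints (illustrative sketch).

    parents, children: lists of (x, y) detections for two joint types.
    offsets: one (dx, dy) per parent, the predicted displacement from
        that parent to its child joint (assumed to be read from an
        offset map at the parent's location).
    max_dist: hypothetical threshold on the matching error, in pixels.
    Returns a sorted list of (parent_index, child_index) pairs.
    """
    # Score every parent/child pair by the distance between the child
    # detection and the location predicted by parent + offset.
    candidates = []
    for i, ((px, py), (dx, dy)) in enumerate(zip(parents, offsets)):
        for j, (cx, cy) in enumerate(children):
            err = hypot(px + dx - cx, py + dy - cy)
            if err <= max_dist:
                candidates.append((err, i, j))

    # Greedy step: accept pairs in order of increasing error, and use
    # each joint at most once.
    candidates.sort()
    used_parents, used_children, links = set(), set(), []
    for _err, i, j in candidates:
        if i not in used_parents and j not in used_children:
            links.append((i, j))
            used_parents.add(i)
            used_children.add(j)
    return sorted(links)


# Two people: neck detections, shoulder detections, and one predicted
# neck-to-shoulder offset per neck (made-up numbers).
necks = [(10.0, 10.0), (50.0, 12.0)]
shoulders = [(52.0, 20.0), (11.0, 19.0)]
neck_to_shoulder = [(0.0, 10.0), (1.0, 9.0)]
links = greedy_associate(necks, shoulders, neck_to_shoulder)
```

Because the procedure is a single sort followed by one pass over the candidate pairs, it needs no global optimization over all assignments, which is consistent with the abstract's remark that no expensive post-processing is required.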