AirPose: Multi-View Fusion Network for Aerial 3D Human Pose and Shape Estimation

Nitin Saini, Elia Bonetto, Eric Price, Aamir Ahmad, Michael J. Black

Year: 2022
Citations: 35
Access: Open access

Abstract

In this letter, we present a novel markerless 3D human motion capture (MoCap) system for unstructured, outdoor environments that uses a team of autonomous unmanned aerial vehicles (UAVs) with on-board RGB cameras and computation. Existing methods are limited by their reliance on calibrated cameras and offline processing. Thus, we present the first method (AirPose) to estimate human pose and shape using images captured by multiple extrinsically uncalibrated flying cameras. AirPose itself calibrates the cameras relative to the person instead of relying on any pre-calibration. It uses distributed neural networks running on each UAV that communicate viewpoint-independent information with each other about the person (i.e., their 3D shape and articulated pose). The person's shape and pose are parameterized using the SMPL-X body model, resulting in a compact representation that minimizes communication between the UAVs. The network is trained using synthetic images of realistic virtual environments, and fine-tuned on a small set of real images. We also introduce an optimization-based post-processing method (AirPose+) for offline applications that require higher MoCap quality. We make our method's code and data available for research at https://github.com/robot-perception-group/AirPose. A video describing the approach and results is available at https://youtu.be/xLYe1TNHsfs.
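
To illustrate why a parametric body representation keeps inter-UAV communication small, the following is a minimal sketch of serializing viewpoint-independent body parameters into a byte payload. It is not the message format used by AirPose, which the abstract does not specify. The parameter counts (10 shape coefficients, 21 body joints in axis-angle form), the function names `pack_body_state` and `unpack_body_state`, and the 640x480 image used for comparison are all assumptions made for illustration. The global orientation is deliberately left out, since it depends on the camera viewpoint.

```python
import struct

# Assumed sizes (not stated in the abstract): 10 shape coefficients and
# 21 body joints, each as a 3-value axis-angle rotation.
NUM_SHAPE = 10
NUM_BODY_JOINTS = 21
NUM_POSE = NUM_BODY_JOINTS * 3


def pack_body_state(shape, body_pose):
    """Serialize shape and articulated pose as little-endian float32 values."""
    if len(shape) != NUM_SHAPE:
        raise ValueError(f"expected {NUM_SHAPE} shape values, got {len(shape)}")
    if len(body_pose) != NUM_POSE:
        raise ValueError(f"expected {NUM_POSE} pose values, got {len(body_pose)}")
    values = list(shape) + list(body_pose)
    return struct.pack(f"<{len(values)}f", *values)


def unpack_body_state(payload):
    """Inverse of pack_body_state: returns (shape, body_pose) as lists."""
    values = struct.unpack(f"<{NUM_SHAPE + NUM_POSE}f", payload)
    return list(values[:NUM_SHAPE]), list(values[NUM_SHAPE:])


# Hypothetical example values; 0.5 and 0.25 are exactly representable in float32.
shape = [0.5] * NUM_SHAPE
body_pose = [0.25] * NUM_POSE

payload = pack_body_state(shape, body_pose)

# Size of one uncompressed 640x480 RGB frame, for comparison only.
raw_image_bytes = 640 * 480 * 3

print(f"parameter payload: {len(payload)} bytes")
print(f"raw RGB frame:     {raw_image_bytes} bytes")
```

Under these assumptions the payload holds 73 float32 values, several orders of magnitude smaller than a single uncompressed camera frame, which is the kind of saving the abstract attributes to the SMPL-X parameterization.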

Keywords

Computer science, Artificial intelligence, Computer vision, Pose, RGB color model, Robotics, Representation (politics), Pipeline (software), Set (abstract data type), Motion capture
