首页 /研究 /MPPNet: Multi-Frame Feature Intertwining with Proxy Points for 3D Temporal Object Detection

PERCEPTION

MPPNet: Multi-Frame Feature Intertwining with Proxy Points for 3D Temporal Object Detection

Xuesong Chen, Shaoshuai Shi, Benjin Zhu, Ka Chun Cheung, Hang Xu, Hongsheng Li

发表年份: 2022
访问权限: 开放获取

摘要

Accurate and reliable 3D detection is vital for many applications including autonomous driving vehicles and service robots. In this paper, we present a flexible and high-performance 3D detection framework, named MPPNet, for 3D temporal object detection with point cloud sequences. We propose a novel three-hierarchy framework with proxy points for multi-frame feature encoding and interactions to achieve better detection. The three hierarchies conduct per-frame feature encoding, short-clip feature fusion, and whole-sequence feature aggregation, respectively. To enable processing long-sequence point clouds with reasonable computational resources, intra-group feature mixing and inter-group feature attention are proposed to form the second and third feature encoding hierarchies, which are recurrently applied for aggregating multi-frame trajectory features. The proxy points not only act as consistent object representations for each frame, but also serve as the courier to facilitate feature interaction between frames. The experiments on large Waymo Open dataset show that our approach outperforms state-of-the-art methods with large margins when applied to both short (e.g., 4-frame) and long (e.g., 16-frame) point cloud sequences. Code is available at https://github.com/open-mmlab/OpenPCDet.

关键词

cs.CV

MPPNet: Multi-Frame Feature Intertwining with Proxy Points for 3D Temporal Object Detection

摘要

关键词

相关论文

Artificial intelligence: a modern approach

Are we ready for autonomous driving? The KITTI vision benchmark suite

TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems

Vision meets robotics: The KITTI dataset