PERCEPTION

Yolo+FPN: 2D and 3D Fused Object Detection With an RGB-D Camera

Ya Wang, Andreas Zell

Year
2021
Citations
8

Abstract

In this paper, we propose a new deep neural network system, called Yolo+FPN, which fuses 2D and 3D object detection algorithms to achieve better real-time detection results and faster inference on real robots. Finding an optimized fusion strategy that efficiently combines 3D object detection with 2D detection information is both useful and challenging for indoor and outdoor robots. Satisfying real-time requirements calls for a trade-off between accuracy and efficiency. Compared with our baseline method, we obtain higher training and test accuracies and lower mean losses on the KITTI object detection benchmark; compared with other state-of-the-art methods, we achieve competitive average precision on 3D detection for all classes at all three difficulty levels. We also implemented the Yolo+FPN system with an RGB-D camera and compared object detection speed across different GPUs. For the real-world implementation in both indoor and outdoor scenes, we focus on person detection, which is the most challenging and important of the three classes.

Keywords

Object detection, Artificial intelligence, Computer science, Benchmark (surveying), Computer vision, Object (grammar), RGB color model, Focus (optics), Robot, Pattern recognition (psychology)
