Multi-Class Object Detection Using 2D Poses

Christopher Mayershofer, Ala Hammami, Johannes Fottner

Publication year: 2020
Citations: 2

Abstract

Object detection (OD) methods are finding application in various fields. The OD problem can be divided into two sub-problems, namely object classification and localization. While the former aims to answer the question of what class a given object belongs to, the latter focuses on locating the object within a given image. For localization, both implicit representations, which border the object and its features (e.g. bounding boxes, polygons and masks), and explicit representations, which describe the object's pose in an image (e.g. 6D pose, keypoints), are used. The 2D pose is a simple yet effective representation that has so far been overlooked. In this paper, we therefore motivate and formulate the use of 2D poses for object localization. Furthermore, we present RetinaNet-2DP, an anchor-based convolutional neural network (CNN) that is capable of detecting objects using 2D poses. To do so, we propose the idea of Anchor Poses and the Gaussian Kernel Distance as a similarity metric between poses. Experiments on the DOTA dataset and two robotics use cases from industry highlight the performance of the network architecture and, more generally, demonstrate the potential of the proposed localization representation. Finally, we critically assess our findings and present an outlook on future work.
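
To make the idea of a similarity metric between 2D poses more concrete, the following is a minimal sketch only. The abstract does not define the Gaussian Kernel Distance, so everything here is an assumption made for illustration: the `Pose2D` type (position plus heading angle), the function names, the product-of-Gaussians form, and the bandwidths `sigma_xy` and `sigma_theta` are hypothetical and should not be read as the paper's formulation.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Pose2D:
    """Hypothetical 2D pose: image position (pixels) and heading (radians)."""
    x: float
    y: float
    theta: float


def angle_diff(a: float, b: float) -> float:
    """Signed smallest difference a - b, wrapped into [-pi, pi)."""
    return (a - b + math.pi) % (2.0 * math.pi) - math.pi


def gaussian_pose_similarity(p: Pose2D, q: Pose2D,
                             sigma_xy: float = 10.0,
                             sigma_theta: float = 0.5) -> float:
    """Illustrative Gaussian-kernel similarity in (0, 1].

    Assumed form (not taken from the paper): a Gaussian on the squared
    positional offset multiplied by a Gaussian on the wrapped heading offset.
    """
    d2 = (p.x - q.x) ** 2 + (p.y - q.y) ** 2
    dth = angle_diff(p.theta, q.theta)
    pos_term = math.exp(-d2 / (2.0 * sigma_xy ** 2))
    ang_term = math.exp(-(dth ** 2) / (2.0 * sigma_theta ** 2))
    return pos_term * ang_term


if __name__ == "__main__":
    anchor = Pose2D(100.0, 50.0, 0.0)          # stand-in for an anchor pose
    target = Pose2D(104.0, 53.0, math.radians(10))
    print(gaussian_pose_similarity(anchor, target))
```

In an anchor-based detector, a score of this kind could play the role that intersection-over-union plays for bounding boxes, namely deciding which anchors are matched to which ground-truth objects. Whether RetinaNet-2DP uses it in exactly that way is not stated in the abstract.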

Keywords

Computer science; Class (philosophy); Object detection; Artificial intelligence; Computer vision; Object-class detection; Object (grammar); Pattern recognition (psychology); Face detection; Facial recognition system
