Pushing the envelope in deep visual recognition for mobile platforms

Lorenzo Alvino

Year of publication: 2017
Access: Open access

Abstract

Image classification is the task of assigning an input image a label from a fixed set of categories. One of its most important application areas is robotics, in particular a robot's need to be aware of its surroundings and to exploit that information for its tasks. In this work we consider the problem of a robot that enters a new environment and wants to understand the visual data coming from its camera in order to extract knowledge from it. As our main novelty, we want to remove the need for a physical robot, which can be expensive and unwieldy, and thereby hopefully enhance, speed up, and ease research in this field. To that end, we propose to develop an application for a mobile platform that wraps several deep visual recognition tasks. First, we deal with simple image classification, testing a model obtained from an AlexNet trained on the ILSVRC 2012 dataset. Several photo settings are considered to better understand which factors most affect classification quality. For the same purpose, we are interested in integrating the classification task with an extra module that segments the object inside the image. In particular, we propose a technique for extracting the object's shape and removing the background, so that classification focuses only on the region occupied by the object. Another significant task included is object discovery. Its purpose is to simulate the situation in which the robot needs a certain object to complete one of its activities: it searches for the object by looking around and scanning the surrounding environment to determine the object's location. Finally, we provide a tool for creating customized, task-specific databases, meant to better suit one's needs in a particular vision task.

Keywords

cs.CV, cs.LG
