Early Fusion for Goal Directed Robotic Vision

Aaron Walsman, Yonatan Bisk, Saadia Gabriel, Dipendra Misra, Yoav Artzi, Yejin Choi, Dieter Fox

Year: 2018
Access: Open access

Abstract

Building perceptual systems for robotics that perform well under tight computational budgets requires novel architectures that rethink the traditional computer vision pipeline. Modern vision architectures require the agent to build a summary representation of the entire scene, even if most of the input is irrelevant to the agent's current goal. In this work, we flip this paradigm by introducing EarlyFusion vision models that condition on a goal to build custom representations for downstream tasks. We show that these goal-specific representations can be learned more quickly, are substantially more parameter-efficient, and are more robust than existing attention mechanisms in our domain. We demonstrate the effectiveness of these methods on a simulated robotic item retrieval problem, with the system trained fully end-to-end via imitation learning.
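
The abstract does not say how the goal conditioning is implemented, so the following is only a minimal sketch of one common way to fuse a goal early: append the goal vector to every pixel's channels before any visual processing. The function names, the toy image, and the 2-dimensional goal embedding are all illustrative assumptions, not the authors' architecture. A late-fusion helper is included only for contrast.

```python
from typing import List

# Image layout assumed here: height x width x channels (nested lists).
Image = List[List[List[float]]]


def early_fuse(image: Image, goal: List[float]) -> Image:
    """Append the goal vector to every pixel's channel list.

    Any layers applied afterwards see the goal at every spatial location,
    so the features they compute can depend on it from the start.
    """
    return [[pixel + goal for pixel in row] for row in image]


def late_fuse(scene_summary: List[float], goal: List[float]) -> List[float]:
    """Contrast case: join the goal to an already-computed scene summary."""
    return scene_summary + goal


# Hypothetical 2x2 RGB image and a hypothetical 2-dimensional goal embedding.
image = [[[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],
         [[0.7, 0.8, 0.9], [1.0, 0.0, 0.5]]]
goal = [1.0, 0.0]

fused = early_fuse(image, goal)
print(len(fused), len(fused[0]), len(fused[0][0]))  # 2 2 5
```

In this sketch the fused input has five channels per pixel (three colour channels plus two goal channels), whereas the late-fusion variant only sees the goal after the whole scene has been summarised.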

Keywords

cs.CV, cs.RO
