Interactive Visual Grounding of Referring Expressions for Human-Robot Interaction
Mohit Shridhar, David Hsu
- Year: 2018
- Access: Open access
Abstract
This paper presents INGRESS, a robot system that follows human natural language instructions to pick and place everyday objects. The core issue is the grounding of referring expressions: inferring objects and their relationships from input images and language expressions. INGRESS allows for unconstrained object categories and unconstrained language expressions. Further, it asks questions to disambiguate referring expressions interactively. To achieve this, we take the approach of grounding by generation and propose a two-stage neural network model for grounding. The first stage uses a neural network to generate visual descriptions of objects, compares them with the input language expression, and identifies a set of candidate objects. The second stage uses another neural network to examine all pairwise relations between the candidates and infers the most likely referred object. The same neural networks are used for both grounding and question generation for disambiguation. Experiments show that INGRESS outperformed a state-of-the-art method on the RefCOCO dataset and in robot experiments with humans.
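The two-stage control flow described in the abstract can be illustrated with a toy sketch. This is not the INGRESS implementation: the neural networks are replaced by simple stand-ins, and every name here (`Obj`, `jaccard`, `stage_one`, `stage_two`, `ground`, `RELATIONS`, the `threshold` value) is a hypothetical choice made for illustration. Object captions are assumed to be given, where the real system would generate them from image regions, and token overlap stands in for the learned comparison between descriptions and the input expression.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass
class Obj:
    name: str      # hypothetical identifier for a detected image region
    caption: str   # stand-in for the description a stage-1 network would generate
    x: float       # horizontal centre of the region, used by the toy relation check


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity; a toy replacement for a learned comparison."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0


def stage_one(objects, expression, threshold=0.2):
    """Stage 1: keep objects whose caption is similar enough to the expression."""
    return [o for o in objects if jaccard(o.caption, expression) >= threshold]


# Toy spatial relations (assumption): the real second stage is a neural network.
RELATIONS = {
    "left of": lambda target, other: target.x < other.x,
    "right of": lambda target, other: target.x > other.x,
}


def stage_two(candidates, expression):
    """Stage 2: score all ordered candidate pairs and return the best target."""
    text = expression.lower()
    for phrase, holds in RELATIONS.items():
        if f" {phrase} " in text:
            subject, reference = text.split(f" {phrase} ", 1)

            def pair_score(pair):
                target, other = pair
                if not holds(target, other):
                    return 0.0
                return (jaccard(target.caption, subject)
                        * jaccard(other.caption, reference))

            best = max(permutations(candidates, 2), key=pair_score, default=None)
            if best is not None:
                return best[0]
    # No known relation (or fewer than two candidates): fall back to stage-1 similarity.
    return max(candidates, key=lambda o: jaccard(o.caption, expression), default=None)


def ground(objects, expression):
    """Run both stages and return the most likely referred object, or None."""
    return stage_two(stage_one(objects, expression), expression)


scene = [
    Obj("cup_a", "red cup", 0.2),
    Obj("cup_b", "red cup", 0.8),
    Obj("bowl", "blue bowl", 0.5),
    Obj("plate", "green plate", 0.9),
]
print(ground(scene, "the red cup left of the blue bowl").name)
```

In this toy scene the first stage drops the plate, and the second stage separates the two red cups by their position relative to the bowl, so the call selects `cup_a`. A fuller sketch would also cover the interactive part of the abstract, asking a question when several candidates remain equally plausible.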