INVIGORATE: Interactive Visual Grounding and Grasping in Clutter
Hanbo Zhang, Yunfan Lu, Cunjun Yu, David Hsu, Xuguang Lan, Nanning Zheng
- Year: 2021
- Access: Open access
Abstract
This paper presents INVIGORATE, a robot system that interacts with humans through natural language and grasps a specified object in clutter. The objects may occlude, obstruct, or even stack on top of one another. INVIGORATE embodies several challenges: (i) inferring the target object among other occluding objects from input language expressions and RGB images, (ii) inferring object blocking relationships (OBRs) from the images, and (iii) synthesizing a multi-step plan to ask questions that disambiguate the target object and to grasp it successfully. We train separate neural networks for object detection, for visual grounding, for question generation, and for OBR detection and grasping. They allow for unrestricted object categories and language expressions, subject to the training datasets. However, errors in visual perception and ambiguity in human language are inevitable and negatively impact the robot's performance. To overcome these uncertainties, we build a partially observable Markov decision process (POMDP) that integrates the learned neural network modules. Through approximate POMDP planning, the robot tracks the history of observations and asks disambiguation questions in order to achieve a near-optimal sequence of actions that identify and grasp the target object. INVIGORATE combines the benefits of model-based POMDP planning and data-driven deep learning. Preliminary experiments with INVIGORATE on a Fetch robot show significant benefits of this integrated approach to object grasping in clutter with natural language interactions. A demonstration video is available at https://youtu.be/zYakh80SGcU.
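The idea of tracking observations and asking disambiguation questions can be illustrated with a toy Bayesian belief update over candidate objects. This is only a minimal sketch under stated assumptions, not INVIGORATE's actual POMDP model or planner: the function names `update_belief` and `choose_action`, the answer-reliability parameter `p_correct`, and the threshold `grasp_threshold` are all hypothetical, and the greedy rule stands in for the approximate POMDP planning that the abstract describes.

```python
def update_belief(belief, asked, answer_yes, p_correct=0.9):
    """Bayes update of P(target = i) after asking "Do you mean object `asked`?".

    `p_correct` is an assumed probability that the human's yes/no answer
    is interpreted correctly; it is not a value taken from the paper.
    """
    posterior = []
    for i, prior in enumerate(belief):
        # Probability of hearing "yes" if object i were the true target.
        p_yes = p_correct if i == asked else 1.0 - p_correct
        likelihood = p_yes if answer_yes else 1.0 - p_yes
        posterior.append(prior * likelihood)
    total = sum(posterior)
    return [p / total for p in posterior]


def choose_action(belief, grasp_threshold=0.8):
    """Greedy stand-in for a planner: grasp when confident, otherwise ask."""
    best = max(range(len(belief)), key=lambda i: belief[i])
    if belief[best] >= grasp_threshold:
        return ("grasp", best)
    return ("ask", best)


# Three candidate objects; the grounding module (assumed) favors object 0.
belief = [0.5, 0.3, 0.2]
action = choose_action(belief)                            # not confident yet, so ask
belief = update_belief(belief, asked=0, answer_yes=False)  # human says "no"
```

After the "no" answer, probability mass shifts away from object 0 toward the remaining candidates, so the next question or grasp targets a different object. A full POMDP planner would additionally weigh the cost of asking against the risk of a wrong grasp over multiple steps.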