CLIPUNetr: Assisting Human-robot Interface for Uncalibrated Visual Servoing Control with CLIP-driven Referring Expression Segmentation
Chen Jiang, Yuchen Yang, Martin Jagersand
- Year
- 2023
- Access
- Open access
Abstract
The classical human-robot interface in uncalibrated image-based visual servoing (UIBVS) relies on either human annotations or semantic segmentation with categorical labels. Both methods fail to match natural human communication and convey rich semantics in manipulation tasks as effectively as natural language expressions. In this paper, we tackle this problem by using referring expression segmentation, which is a prompt-based approach, to provide more in-depth information for robot perception. To generate high-quality segmentation predictions from referring expressions, we propose CLIPUNetr - a new CLIP-driven referring expression segmentation network. CLIPUNetr leverages CLIP's strong vision-language representations to segment regions from referring expressions, while utilizing its ``U-shaped'' encoder-decoder architecture to generate predictions with sharper boundaries and finer structures. Furthermore, we propose a new pipeline to integrate CLIPUNetr into UIBVS and apply it to control robots in real-world environments. In experiments, our method improves boundary and structure measurements by an average of 120% and can successfully assist real-world UIBVS control in an unstructured manipulation environment.
Keywords
Related papers
Review and perspectives on multimodal perception, mutual cognition, and embodied execution for human–robot collaboration in Industry 5.0
Kai Ding, Qingyuan Mao, Yaqian Zhang +3 more
Robotics and Computer-Integrated Manufacturing · 2026
Towards human-centric manufacturing: Task planning under uncertainties in human–robot collaborative assembly
Yingchao You, Ze Ji, Changyun Wei
Robotics and Computer-Integrated Manufacturing · 2026
Agentic HRC: Achieving context alignment via memory for Human–Robot Collaboration
Jiahui Si, Wenchao Li, Xi Chen +4 more
Robotics and Computer-Integrated Manufacturing · 2026
Adaptive Physics-informed Transformer with Gaussian process residual compensation for inverse dynamics modeling in Human–Robot Collaboration
Rui Qian, Xi Zhang, Dongpeng Li +2 more
Robotics and Computer-Integrated Manufacturing · 2026