
HOVER-BVI: Handover Vision-Language Embodied Robot System for BVI Users

Narasimha Bonda, Qin Chenxin, Yukiko Iwasaki, Hiroyasu Iwata

Year: 2025
Citations: 1

Abstract

We present HOVER-BVI, a robot system that assists blind and visually impaired (BVI) users by dynamically planning and executing grasping tasks in indoor environments. Unlike traditional, rigidly programmed assistive robots, our approach uses a large language model (LLM) as a flexible task planner, combined with a hierarchical state machine that sequentially executes action primitives. The system maintains an object memory, generates natural-language descriptions of the workspace, and interacts with the user via speech. By using the LLM for high-level planning, the system interprets diverse voice commands and compiles them into sequences of primitives, which makes dynamic tasks possible. To validate the system, we designed experiments with blindfolded users that evaluate its effectiveness in tasks such as object search, description, and retrieval. Preliminary analysis suggests that, in a semantically constrained indoor setting, the LLM-based planner can achieve high success rates while providing intuitive user interaction. We highlight the system's modularity and extensibility and discuss usability considerations for real-world blind users. This work adapts LLM-based task planning from general-purpose settings (e.g., RoboCup GPSR) to the BVI domain, exploiting the semantically rich yet structured indoor context to improve reliability and user trust in assistive robotics.
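
The planner-plus-primitives pattern described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the primitive names (`search`, `describe`, `grasp`, `handover`), the JSON plan format, and the fixed `llm_output` string standing in for a real LLM response are all assumptions made for illustration, and the hierarchical state machine is reduced to a flat sequential executor.

```python
import json

# Hypothetical action primitives; the abstract does not name the real ones.
# Each takes the shared object memory and a target object, and returns success.
def search(memory, obj):
    # Pretend perception located the object and record it in object memory.
    memory[obj] = "table"
    return True

def describe(memory, obj):
    # Succeeds only if the object is already in memory.
    return obj in memory

def grasp(memory, obj):
    return obj in memory

def handover(memory, obj):
    return obj in memory

PRIMITIVES = {
    "search": search,
    "describe": describe,
    "grasp": grasp,
    "handover": handover,
}

def parse_plan(llm_output):
    """Parse the planner's JSON output and reject unknown primitives."""
    plan = json.loads(llm_output)
    for step in plan:
        if step["primitive"] not in PRIMITIVES:
            raise ValueError(f"unknown primitive: {step['primitive']}")
    return plan

def execute(plan, memory):
    """Run primitives in order and stop at the first failure."""
    log = []
    for step in plan:
        ok = PRIMITIVES[step["primitive"]](memory, step["object"])
        log.append((step["primitive"], ok))
        if not ok:
            break
    return log

# Assumed stand-in for an LLM response to a command like "bring me the mug".
llm_output = (
    '[{"primitive": "search", "object": "mug"},'
    ' {"primitive": "grasp", "object": "mug"},'
    ' {"primitive": "handover", "object": "mug"}]'
)
memory = {}
log = execute(parse_plan(llm_output), memory)
```

Validating the plan against a fixed primitive set before execution is one way a constrained indoor domain could keep an LLM planner's output within what the robot can actually do; whether HOVER-BVI does this is not stated in the abstract.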

Keywords

Embodied cognition, Handover, Computer science, Robot, Human–computer interaction, Mobile robot, Artificial intelligence, Telecommunications
