PERCEPTION

Object Positions Interpretation System for Service Robots Through Targeted Object Marking

Kosei Yamao, Daiju Kanaoka, Kosei Isomoto, Hakaru Tamukoh

Publication year
2025
Citations
2

Abstract

Service robots are typically required to interpret and execute a variety of complex tasks in home environments. To do so, they must recognize elements of the environment, such as furniture, and understand the positional relationships between objects. Set-of-Mark (SoM) is a visual prompting method that helps interpret the relationships between semantic regions by overlaying a mark on each region. However, SoM also marks segmented regions that are not objects, such as walls and floors, and these marks act as noise when interpreting object positions. To address this problem, we propose a novel object-position interpretation system that combines an object detection model with a vision-language model (VLM). The object detection model is used to mark only objects, allowing the VLM to interpret object positions efficiently. Furthermore, the system improves accuracy by including the original image and the labels output by the object detection model in the input to the VLM. Experimental results show that the proposed system outperforms SoM in interpreting object positions.
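The marking step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Detection` and `Mark` types, the `assign_marks` and `build_prompt` helpers, and the prompt wording are all hypothetical, and the detector output is assumed to be a list of class labels with pixel bounding boxes. The sketch only shows the idea of numbering detected objects (and nothing else) and passing their labels to the VLM as text alongside the original and marked images.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    """One detector output (hypothetical format)."""
    label: str                       # class name reported by the detector
    box: Tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max) in pixels


@dataclass
class Mark:
    """A numbered mark to overlay on the image at `center`."""
    mark_id: int
    label: str
    center: Tuple[int, int]


def assign_marks(detections: List[Detection]) -> List[Mark]:
    """Give each detected object a numeric mark placed at its box center.

    Only detector outputs are marked, so background regions such as
    walls and floors never receive a mark.
    """
    marks = []
    for mark_id, det in enumerate(detections, start=1):
        x_min, y_min, x_max, y_max = det.box
        center = ((x_min + x_max) // 2, (y_min + y_max) // 2)
        marks.append(Mark(mark_id, det.label, center))
    return marks


def build_prompt(marks: List[Mark], question: str) -> str:
    """Build the text part of the VLM input (assumed wording).

    The image inputs (original and marked) would be attached separately.
    """
    lines = [
        "Image 1 is the original image. Image 2 is the same image with numbered marks.",
        "Marked objects:",
    ]
    lines += [f"[{m.mark_id}] {m.label}" for m in marks]
    lines.append(f"Question: {question}")
    return "\n".join(lines)
```

Given the marks, the text prompt would be sent to a VLM together with both images; how the marks are drawn and which VLM is called are left out here because the abstract does not specify them.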

Keywords

Object (grammar), Computer science, Interpretation (philosophy), Robot, Service (business), Artificial intelligence, Service robot, Human–computer interaction, Deep-sky object, Computer vision
