CollaBot: Vision-Language Guided Simultaneous Collaborative Manipulation

Kun Song, Gaoming Chen, Shentao Ma, Ninglong Jin, Guangbao Zhao, Mingyu Ding, Zhenhua Xiong, Jia Pan

Publication year: 2025
Access: Open access

Abstract

One central goal of robotics is to enable robots to interact with the physical world. Traditional manipulation studies primarily focus on single robots and relatively small objects. However, factory and domestic environments often require large-object manipulation, such as moving tables, where multiple robots must work collaboratively. Existing studies still lack a generalizable framework that can handle diverse objects, tasks, and robot team sizes. In this work, we propose CollaBot, a generalist framework for simultaneous collaborative manipulation. First, we use SEEM for scene segmentation and target-object extraction. Then, we propose a collaborative grasping framework that decomposes the task into local grasp pose generation and global coordination. Finally, we design a two-stage planning module to generate collision-free trajectories for task execution. Experimental results across different settings with varying objects, tasks, and numbers of robots indicate that our framework achieves a 72% success rate. This marks a substantial improvement over behavior cloning-based methods, validating the advantages of the proposed framework in complex multi-robot cooperative tasks. Real-world experiments further demonstrate the feasibility of our method in practical applications.

Keywords

cs.RO
