MANIPULATION

Observe Then Act: Asynchronous Active Vision-Action Model for Robotic Manipulation

Guokang Wang, Hang Li, Shuyuan Zhang, Di Guo, Yanhong Liu, Huaping Liu

Publication year
2024
Access
Open access

Abstract

In real-world scenarios, many robotic manipulation tasks are hindered by occlusions and limited fields of view, which pose significant challenges for passive observation-based models that rely on fixed or wrist-mounted cameras. In this paper, we investigate the problem of robotic manipulation under limited visual observation and propose a task-driven asynchronous active vision-action model. Our model serially connects a camera Next-Best-View (NBV) policy with a gripper Next-Best Pose (NBP) policy, and trains them in a sensor-motor coordination framework using few-shot reinforcement learning. This approach allows the agent to adjust a third-person camera to actively observe the environment based on the task goal, and subsequently infer the appropriate manipulation actions. We trained and evaluated our model on 8 viewpoint-constrained tasks in RLBench. The results demonstrate that our model consistently outperforms baseline algorithms, showing its effectiveness in handling visual constraints in manipulation tasks.
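
The serial "observe, then act" structure described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `ToyEnv`, `observe_then_act`, and the two policy callables are hypothetical names, the environment is a toy stand-in rather than RLBench, and the policies are hand-written functions rather than the learned NBV and NBP networks from the paper.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Observation = List[float]                 # placeholder for visual features
Viewpoint = Tuple[float, float, float]    # hypothetical camera position
GripperPose = Tuple[float, float, float]  # hypothetical gripper position


@dataclass
class ToyEnv:
    """Hypothetical toy environment (not RLBench): the target is occluded
    unless the third-person camera has been moved to the side (x > 0)."""
    target: GripperPose = (0.5, 0.0, 0.2)
    view: Viewpoint = (0.0, 0.0, 1.0)

    def observe(self) -> Observation:
        visible = self.view[0] > 0
        # An occluded view returns zeros instead of the target position.
        return list(self.target) if visible else [0.0, 0.0, 0.0]

    def move_camera(self, view: Viewpoint) -> None:
        self.view = view

    def move_gripper(self, pose: GripperPose) -> bool:
        # Success only if the gripper reaches the true target.
        return pose == self.target


def observe_then_act(
    env: ToyEnv,
    nbv_policy: Callable[[Observation], Viewpoint],
    nbp_policy: Callable[[Observation], GripperPose],
) -> bool:
    """One step: pick a viewpoint first, then pick the gripper pose
    from the observation taken at that new viewpoint."""
    obs = env.observe()            # initial, possibly occluded view
    env.move_camera(nbv_policy(obs))
    obs = env.observe()            # re-observe from the chosen view
    return env.move_gripper(nbp_policy(obs))


def read_pose(obs: Observation) -> GripperPose:
    # Stand-in for an NBP policy: read the pose off the observation.
    return (obs[0], obs[1], obs[2])


active = observe_then_act(ToyEnv(), lambda o: (1.0, 0.0, 0.5), read_pose)
passive = observe_then_act(ToyEnv(), lambda o: (0.0, 0.0, 1.0), read_pose)
```

In this toy setting the gripper pose can only be inferred correctly after the camera has moved to an unoccluded viewpoint, which is the dependency the serial NBV-then-NBP connection is meant to capture. The sketch does not cover training, which the abstract attributes to few-shot reinforcement learning.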

Keywords

cs.RO, cs.CV
