Text-driven Affordance Learning from Egocentric Vision
Tomoya Yoshida, Shuhei Kurita, Taichi Nishimura, Shinsuke Mori
- 发表年份
- 2024
- 访问权限
- 开放获取
摘要
Visual affordance learning is a key component for robots to understand how to interact with objects. Conventional approaches in this field rely on pre-defined objects and actions, falling short of capturing diverse interactions in realworld scenarios. The key idea of our approach is employing textual instruction, targeting various affordances for a wide range of objects. This approach covers both hand-object and tool-object interactions. We introduce text-driven affordance learning, aiming to learn contact points and manipulation trajectories from an egocentric view following textual instruction. In our task, contact points are represented as heatmaps, and the manipulation trajectory as sequences of coordinates that incorporate both linear and rotational movements for various manipulations. However, when we gather data for this task, manual annotations of these diverse interactions are costly. To this end, we propose a pseudo dataset creation pipeline and build a large pseudo-training dataset: TextAFF80K, consisting of over 80K instances of the contact points, trajectories, images, and text tuples. We extend existing referring expression comprehension models for our task, and experimental results show that our approach robustly handles multiple affordances, serving as a new standard for affordance learning in real-world scenarios.
关键词
相关论文
面向大型复杂构件的移动机器人辅助磨削技术综述
Yusen Li, Ziwei Wang, Xiangye Zhu 等 12 位作者
Robotics and Computer-Integrated Manufacturing · 2026
基于物理信息与机器学习的五轴铣削TC4钛合金刀具磨损融合预测模型
Shaoqing Qin, Lida Zhu, Yanpeng Hao 等 10 位作者
Robotics and Computer-Integrated Manufacturing · 2026
通过新型压电主动阻尼刀柄提升机器人铣削质量
Bo Li, Yuanbo Zhao, Huijie Xiao 等 6 位作者
Robotics and Computer-Integrated Manufacturing · 2026
一种利用磁致非线性宽带多向被动减振器抑制机器人铣削低频颤振的新方法
Hao Li, Yuhui Yu, Rui Fu 等 6 位作者
Robotics and Computer-Integrated Manufacturing · 2026