6-DoF Grasp Detection Method Based on Vision Language Guidance
Xixing Li, Rui Wu, Tao Liu
- Publication year
- 2025
- Citations
- 1
- Access
- Open access
Abstract
Interactive robotic grasping lets a robot grasp the object that a user selects. Most deep-learning-based interactive grasping methods combine a vision-language model with a grasp detection model. In existing methods, however, the vision-language model is hard to train and generalizes poorly, and the robot struggles to grasp small target objects. This paper therefore proposes a vision-language-guided 6-DoF grasp detection method that takes a text instruction and an RGBD image of the scene as input and outputs the 6-DoF grasp pose of the object the instruction refers to. To improve the trainability and feature extraction ability of the vision-language model, we design a multi-head attention mechanism combined with hybrid normalization. In addition, a local attention mechanism is introduced into the grasp detection model to strengthen the interaction between global and local information in the point cloud data, which improves grasp detection for small target objects. The method first uses the improved vision-language model to predict the planar position of the target object, then uses the improved grasp detection model to predict all graspable poses in the scene, and finally uses the planar position to select the grasp poses that belong to the target object. Both proposed models perform strongly across various scenarios of public datasets while retaining a degree of generalization ability. In real-world grasping experiments, the proposed vision-language-guided 6-DoF grasp detection method achieved a grasp success rate of 95%.
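The final step of the pipeline, selecting the target object's grasps using its planar position, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the planar position is a 2D bounding box in image pixels, that each predicted grasp has a center point in the camera frame, and that a pinhole camera with known intrinsics maps those points to pixels. The function name `filter_grasps_by_box` and all parameter names are hypothetical.

```python
from typing import List, Sequence, Tuple


def filter_grasps_by_box(
    centers: Sequence[Tuple[float, float, float]],
    scores: Sequence[float],
    box: Tuple[float, float, float, float],
    intrinsics: Tuple[float, float, float, float],
) -> List[int]:
    """Return indices of grasps whose projected center lies inside `box`.

    centers:    grasp centers (x, y, z) in the camera frame, in meters (assumed).
    scores:     one confidence score per grasp (assumed to exist).
    box:        (u_min, v_min, u_max, v_max) in pixels (assumed box format).
    intrinsics: pinhole parameters (fx, fy, cx, cy) (assumed camera model).

    The returned indices are sorted by descending score.
    """
    fx, fy, cx, cy = intrinsics
    u_min, v_min, u_max, v_max = box

    kept = []
    for i, (x, y, z) in enumerate(centers):
        if z <= 0:
            # Points at or behind the camera plane cannot be projected.
            continue
        # Pinhole projection from the camera frame to pixel coordinates.
        u = fx * x / z + cx
        v = fy * y / z + cy
        if u_min <= u <= u_max and v_min <= v <= v_max:
            kept.append(i)

    # Best-scoring grasp on the target object comes first.
    kept.sort(key=lambda i: scores[i], reverse=True)
    return kept
```

Under these assumptions, the first returned index would be the grasp to execute. A real system might instead test the gripper contact points or use a segmentation mask, since a grasp center can project inside the box while belonging to a neighboring object.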