MANIPULATION

GraspCorrect: Robotic Grasp Correction via Vision-Language Model-Guided Feedback

Sungjae Lee, Yeonjoo Hong, Kwang In Kim

Year: 2025
Access: Open access

Abstract

Despite significant advancements in robotic manipulation, achieving consistent and stable grasping remains a fundamental challenge, often limiting the successful execution of complex tasks. Our analysis reveals that even state-of-the-art policy models frequently exhibit unstable grasping behaviors, leading to failure cases that create bottlenecks in real-world robotic applications. To address these challenges, we introduce GraspCorrect, a plug-and-play module designed to enhance grasp performance through vision-language model-guided feedback. GraspCorrect employs an iterative visual question-answering framework with two key components: grasp-guided prompting, which incorporates task-specific constraints, and object-aware sampling, which ensures the selection of physically feasible grasp candidates. By iteratively generating intermediate visual goals and translating them into joint-level actions, GraspCorrect significantly improves grasp stability and consistently enhances task success rates across existing policy models in the RLBench and CALVIN datasets.
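The abstract describes the loop only at a high level, so the following is a minimal illustrative sketch rather than the authors' implementation. It assumes that object-aware sampling can be approximated by drawing candidate contact pixels only from a binary object mask, and that the vision-language model's answer can be stood in for by a plain scoring function. All names here (`object_aware_candidates`, `correct_grasp`, `toy_score`) are hypothetical, and the steps of generating intermediate visual goals and translating them into joint-level actions are not modeled.

```python
import random
from typing import Callable, List, Optional, Sequence, Tuple

Point = Tuple[int, int]  # (row, col) pixel coordinate of a candidate contact point


def object_aware_candidates(
    mask: Sequence[Sequence[int]], k: int, rng: random.Random
) -> List[Point]:
    """Sample up to k candidate points that lie on the object mask (assumption)."""
    on_object = [
        (r, c)
        for r, row in enumerate(mask)
        for c, value in enumerate(row)
        if value
    ]
    return rng.sample(on_object, min(k, len(on_object)))


def correct_grasp(
    mask: Sequence[Sequence[int]],
    score_fn: Callable[[Point], float],
    rounds: int = 3,
    k: int = 4,
    seed: int = 0,
) -> Optional[Point]:
    """Iteratively sample candidates and keep the highest-scoring one.

    score_fn is a placeholder for a VLM query; here it is any callable
    that maps a candidate point to a number (higher is better).
    """
    rng = random.Random(seed)
    best: Optional[Point] = None
    best_score = float("-inf")
    for _ in range(rounds):
        candidates = object_aware_candidates(mask, k, rng)
        if not candidates:
            break  # nothing on the object to grasp
        for candidate in candidates:
            score = score_fn(candidate)
            if score > best_score:
                best, best_score = candidate, score
    return best


def toy_score(point: Point) -> float:
    """Hypothetical stand-in for VLM feedback: prefer points near (2, 2)."""
    return -float(abs(point[0] - 2) + abs(point[1] - 2))


# A 5x5 mask whose object occupies the central 3x3 block.
MASK = [[1 if 1 <= r <= 3 and 1 <= c <= 3 else 0 for c in range(5)] for r in range(5)]
```

Calling `correct_grasp(MASK, toy_score)` returns a point that always lies on the object, because candidates are never drawn from background pixels. In a real system the scoring callable would wrap a prompted model query instead of a distance heuristic.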

Keywords

cs.AI, cs.RO
