Surgical-LVLM: Learning to Adapt Large Vision-Language Model for Grounded Visual Question Answering in Robotic Surgery
Guankun Wang, Long Bai, Wan Jun Nah, Jie Wang, Zhaoxi Zhang, Zhen Chen, Jinlin Wu, Mobarakol Islam, Hongbin Liu, Hongliang Ren
- 发表年份
- 2024
- 访问权限
- 开放获取
摘要
Recent advancements in Surgical Visual Question Answering (Surgical-VQA) and related region grounding have shown great promise for robotic and medical applications, addressing the critical need for automated methods in personalized surgical mentorship. However, existing models primarily provide simple structured answers and struggle with complex scenarios due to their limited capability in recognizing long-range dependencies and aligning multimodal information. In this paper, we introduce Surgical-LVLM, a novel personalized large vision-language model tailored for complex surgical scenarios. Leveraging the pre-trained large vision-language model and specialized Visual Perception LoRA (VP-LoRA) blocks, our model excels in understanding complex visual-language tasks within surgical contexts. In addressing the visual grounding task, we propose the Token-Interaction (TIT) module, which strengthens the interaction between the grounding module and the language responses of the Large Visual Language Model (LVLM) after projecting them into the latent space. We demonstrate the effectiveness of Surgical-LVLM on several benchmarks, including EndoVis-17-VQLA, EndoVis-18-VQLA, and a newly introduced EndoVis Conversations dataset, which sets new performance standards. Our work contributes to advancing the field of automated surgical mentorship by providing a context-aware solution.
关键词
相关论文
机器人技术在整形外科中的应用
Vijay Kumar, Sandhya Pandey
Clinical Journal of Plastic & Reconstructive Surgery · 2026
SurfSurg6D:面向无纹理手术器械的几何一致密集对应位姿估计
Daiyun Shen, Shuojue Yang, Chang Han Low 等 7 位作者
2026
EndoGSim:基于MLLM引导的高斯泼溅的物理感知4D动态内窥镜场景模拟
Changjing Liu, Yiming Huang, Long Bai 等 5 位作者
2026
腹膜后机器人辅助肾输尿管切除术:技术描述与单中心经验
Kawashima A, Ishizuya Y, Yamamoto Y 等 12 位作者
Asian journal of endoscopic surgery · 2026