GroundedSurg: A Multi-Procedure Benchmark for Language-Conditioned Surgical Tool Segmentation
Tajamul Ashraf, Abrar Ul Riyaz, Wasif Tak, Tavaheed Tariq, Sonia Yadav, Moloud Abdar, Janibul Bashir
- Publication year
- 2026
- Access
- Open access
Abstract
Clinically reliable perception of surgical scenes is essential for advancing intelligent, context-aware intraoperative assistance such as instrument handoff guidance, collision avoidance, and workflow-aware robotic support. Existing surgical tool benchmarks primarily evaluate category-level segmentation, requiring models to detect all instances of predefined instrument classes. However, real-world clinical decisions often require resolving references to a specific instrument instance based on its functional role, spatial relation, or anatomical interaction, capabilities that current evaluation paradigms do not capture. We introduce GroundedSurg, the first language-conditioned, instance-level surgical grounding benchmark. Each benchmark instance pairs a surgical image with a natural-language description targeting a single instrument, accompanied by structured spatial grounding annotations including bounding boxes and point-level anchors. The dataset spans ophthalmic, laparoscopic, robotic, and open procedures, encompassing diverse instrument types, imaging conditions, and operative complexities. By jointly evaluating linguistic reference resolution and pixel-level localization, GroundedSurg enables systematic evaluation of vision-language models (VLMs) in clinically realistic multi-instrument scenes. Extensive experiments demonstrate substantial performance gaps across modern segmentation models and VLMs, highlighting the urgent need for clinically grounded vision-language reasoning in surgical AI systems. Code and data are publicly available at https://github.com/gaash-lab/GroundedSurg
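To make the annotation structure described in the abstract concrete, the following is a minimal sketch of what one benchmark instance and a pixel-level score might look like. It is an illustration under stated assumptions, not the released data format or the authors' evaluation code: the class name `GroundingSample`, all field names, the `(x_min, y_min, x_max, y_max)` box convention, and the choice of mask intersection-over-union as the localization score are hypothetical, since the abstract does not specify them.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Binary mask stored as rows of 0/1 values (an assumption for illustration).
Mask = List[List[int]]


@dataclass
class GroundingSample:
    """Hypothetical record for one language-conditioned grounding instance."""
    image_path: str                   # path to the surgical image
    query: str                        # description targeting a single instrument
    bbox: Tuple[int, int, int, int]   # assumed (x_min, y_min, x_max, y_max)
    points: List[Tuple[int, int]]     # point-level anchors as (x, y) pixels
    mask: Mask                        # ground-truth mask of the referred instrument


def mask_iou(pred: Mask, target: Mask) -> float:
    """Intersection-over-union of two binary masks of equal shape."""
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                inter += 1
            if p or t:
                union += 1
    # Convention chosen here: two empty masks count as a perfect match.
    return inter / union if union else 1.0


# Made-up example values, not taken from the dataset.
sample = GroundingSample(
    image_path="example.png",
    query="the forceps grasping tissue on the left",
    bbox=(0, 0, 1, 1),
    points=[(0, 0)],
    mask=[[1, 0], [1, 0]],
)
prediction = [[1, 1], [0, 0]]
score = mask_iou(prediction, sample.mask)
```

Because each query refers to exactly one instrument, a score of this kind penalizes a model that segments the correct tool category but the wrong instance, which is the failure mode category-level benchmarks cannot expose.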