Monocular 3D Tooltip Tracking in Robotic Surgery—Building a Multi-Stage Pipeline
Sanjeev Narasimhan, Mehmet Kerem Türkcan, Mattia Ballo, Sarah Choksi, Filippo Filicori, Zoran Kostić
- Publication year: 2025
- Citations: 5
- Access: Open access
Abstract
Tracking the precise movement of surgical tools is essential for enabling automated analysis, providing feedback, and enhancing safety in robotic-assisted surgery. Accurate 3D tracking of surgical tooltips is challenging to implement when using monocular videos due to the complexity of extracting depth information. We propose a pipeline that combines state-of-the-art foundation models—Florence2 and Segment Anything 2 (SAM2)—for zero-shot 2D localization of tooltip coordinates using a monocular video input. Localization predictions are refined through supervised training of the YOLOv11 segmentation model to enable real-time applications. The depth estimation model Metric3D computes the relative depth and provides tooltip camera coordinates, which are subsequently transformed into world coordinates via a linear model estimating rotation and translation parameters. An experimental evaluation on the JIGSAWS Suturing Kinematic dataset achieves a 3D Average Jaccard score on tooltip tracking of 84.5 and 91.2 for the zero-shot and supervised approaches, respectively. The results validate the effectiveness of our approach and its potential to enhance real-time guidance and assessment in robotic-assisted surgical procedures.
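The final stage described above, turning a 2D tooltip location plus an estimated depth into camera coordinates and then mapping those to world coordinates with a linear model, can be illustrated with a minimal sketch. This is not the authors' implementation: the pinhole intrinsics (`fx`, `fy`, `cx`, `cy`), the function names, and the use of an unconstrained least-squares fit are all assumptions made for illustration. The fitted matrix `A` is a general 3x3 linear map, so it can also absorb a scale factor in the depth estimate, but it is not constrained to be a pure rotation.

```python
import numpy as np


def backproject(uv, depth, fx, fy, cx, cy):
    """Pinhole back-projection: pixels (u, v) with depth z -> camera-frame XYZ.

    The intrinsics fx, fy, cx, cy are assumed known (hypothetical values here).
    """
    uv = np.asarray(uv, dtype=float)
    z = np.asarray(depth, dtype=float)
    x = (uv[:, 0] - cx) * z / fx
    y = (uv[:, 1] - cy) * z / fy
    return np.stack([x, y, z], axis=1)


def fit_linear_map(cam_pts, world_pts):
    """Least-squares fit of world ~= cam @ A.T + t from paired 3D points."""
    cam_pts = np.asarray(cam_pts, dtype=float)
    world_pts = np.asarray(world_pts, dtype=float)
    # Append a column of ones so the translation is estimated jointly.
    cam_h = np.hstack([cam_pts, np.ones((len(cam_pts), 1))])
    params, *_ = np.linalg.lstsq(cam_h, world_pts, rcond=None)
    A = params[:3].T  # 3x3 linear part (rotation, possibly with scale)
    t = params[3]     # translation vector
    return A, t


def apply_linear_map(cam_pts, A, t):
    """Map camera-frame points to the world frame with the fitted parameters."""
    return np.asarray(cam_pts, dtype=float) @ A.T + t


# Synthetic check: recover a known rotation about the z-axis and a translation.
rng = np.random.default_rng(0)
cam = rng.uniform(-1.0, 1.0, size=(50, 3)) + np.array([0.0, 0.0, 3.0])
theta = np.deg2rad(20.0)
R_true = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta),  np.cos(theta), 0.0],
    [0.0,            0.0,           1.0],
])
t_true = np.array([0.05, -0.10, 0.30])
world = cam @ R_true.T + t_true

A_hat, t_hat = fit_linear_map(cam, world)
max_err = np.abs(apply_linear_map(cam, A_hat, t_hat) - world).max()
print("max abs error on synthetic data:", max_err)
```

In practice the paired world coordinates would come from ground-truth kinematics such as those in JIGSAWS, and the fit would be evaluated on held-out frames. If a strict rigid transform is required, an orthogonality-constrained fit (for example, an SVD-based Procrustes solution) could replace the unconstrained least squares used here.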