Eye gaze-controlled camera navigation for enhanced robotic surgery with potential cognitive load reduction
N Venkata, D Saranyaraj, Pujeth Potturi
- Publication year: 2025
- Citations: 2
Abstract
This study explores eye-gaze detection as a way to enhance robotic surgery by enabling dynamic, autonomous camera control, with the aim of reducing cognitive load during complex procedures. Conventional robotic systems, such as da Vinci and ZEUS, require surgeons to adjust the camera manually while operating the robotic arms, which splits attention and increases cognitive workload, as evidenced by an 18.4% rise in pupil dilation under high-stress conditions. We propose a hands-free solution using a Raspberry Pi 3 and a custom-trained YOLOv9 (You Only Look Once) model, trained for 100 epochs to detect five gaze directions (left, right, up, down, center) with a mean average precision (mAP50) of 90.78%. While this accuracy is notable, its advantages are best understood against other gaze-tracking methods. Traditional infrared trackers (e.g., Tobii) achieve higher accuracy (95-98%) but are costly and intrusive, relying on specialized hardware that is impractical for resource-constrained surgical settings, whereas our camera-based YOLOv9 approach is cost-effective and non-intrusive. Compared with Faster R-CNN (Region-based Convolutional Neural Network; 90-95% accuracy, 5-10 frames per second (FPS)), YOLOv9’s real-time inference (up to 60 FPS) is better suited to dynamic environments, and compared with SSD (85-90% accuracy, 40-50 FPS), it offers superior precision. Comparative analysis with other YOLO variants (e.g., YOLOv11, YOLOv8) confirms YOLOv9’s superior accuracy (90.78% mAP50) and real-time efficiency (60 FPS) on modest hardware, enabling precise camera alignment with the surgeon’s gaze to enhance procedural precision and reduce cognitive load. These findings underscore the potential of integrating advanced machine learning models such as YOLOv9 into robotic surgery, paving the way for intuitive, precise, and cognitively efficient hands-free systems.
While the system’s design eliminates manual camera adjustments, potentially reducing cognitive load as supported by prior studies, future clinical trials are needed to quantify this benefit in live surgical settings.
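The abstract does not describe how a detected gaze direction is turned into a camera movement. As a rough illustration only, the mapping from the five gaze classes to a pan/tilt adjustment might be sketched as follows. The step size, confidence threshold, angle limits, and all names (`CameraPose`, `update_camera`, `STEP_DEG`) are assumptions made for this sketch and do not come from the paper; the detector itself is not modeled, and its output is represented simply as a label and a confidence score.

```python
from dataclasses import dataclass

# Hypothetical step size (degrees) applied per accepted detection; not from the paper.
STEP_DEG = 2.0

# The five gaze classes named in the abstract, mapped to (pan, tilt) unit directions.
GAZE_TO_DIRECTION = {
    "left": (-1, 0),
    "right": (1, 0),
    "up": (0, 1),
    "down": (0, -1),
    "center": (0, 0),  # gaze is centered: hold the current view
}


@dataclass
class CameraPose:
    """Camera orientation in degrees (illustrative model only)."""
    pan: float = 0.0
    tilt: float = 0.0


def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def update_camera(pose, gaze_label, confidence, min_confidence=0.5, limit_deg=30.0):
    """Return a new pose nudged one step toward the detected gaze direction.

    Detections below min_confidence, or with an unknown label, leave the
    pose unchanged. Both thresholds are assumed values for this sketch.
    """
    if confidence < min_confidence or gaze_label not in GAZE_TO_DIRECTION:
        return pose
    dpan, dtilt = GAZE_TO_DIRECTION[gaze_label]
    return CameraPose(
        pan=clamp(pose.pan + dpan * STEP_DEG, -limit_deg, limit_deg),
        tilt=clamp(pose.tilt + dtilt * STEP_DEG, -limit_deg, limit_deg),
    )
```

In a real system the label and confidence would come from the trained detector on each frame, and the resulting pose would be sent to the camera actuator. Ignoring low-confidence detections is one simple way to keep a misclassified frame from moving the camera.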