Large Language Model‐Embedded Intelligent Robotic Scrub Nurse with Multimodal Input for Enhancing Surgeon–Robot Interaction
Wing Yin Ng, Wanyu Ma, Pheng‐Ann Heng, Philip Wai Yan Chiu, Zheng Li
- Year: 2025
- Citations: 1
- Access: Open access
Abstract
Scrub nurses have crucial responsibilities, particularly in handling instrument‐related tasks. However, significant mental burdens and unfamiliarity with instruments can lead to various human errors. Consequently, the research community has explored robotic prototypes. Unfortunately, these prototypes often focus on specific instrument‐handling tasks or offer non‐intuitive interaction methods, hindering social acceptance. This article proposes a surgeon‐friendly robotic scrub nurse platform that addresses multiple instrument‐related tasks, including grasping and transferring, automatic sorting, and counting. To the best of the authors’ knowledge, this is the first prototype to incorporate audiovisual input modalities and a large language model (LLM) for smooth and intuitive interaction between the surgeon and the robot. Specifically, vision artificial intelligence (AI) provides accurate instrument detection results using oriented bounding boxes with an average precision of 97.6%, guiding robot motion planning. The speech AI recognizes the surgeon's voice commands. The LLM further interprets multimodal information to trigger different robot actions via the “tool use” capability, achieving a standalone success rate of 94% with an average action latency of less than 1 s on a real robotic scrub nurse hardware platform. Physical validation demonstrated that the proposed prototype successfully completed all assigned tasks, proving its feasibility and effectiveness.
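The abstract describes an LLM that turns the surgeon's spoken command and the vision results into robot actions through "tool use". The sketch below illustrates that general pattern only and is not the authors' implementation. It assumes the LLM emits a JSON tool call with a `name` and `arguments`, and that each detection carries an oriented bounding box (centre, size, angle). The names `Detection`, `grasp_and_transfer`, `count_instruments`, `TOOLS`, and `dispatch` are hypothetical, as is the choice to align the gripper yaw with the box angle.

```python
import json
from dataclasses import dataclass


@dataclass
class Detection:
    """One detected instrument with an oriented bounding box (hypothetical layout)."""
    label: str
    cx: float         # box centre x, in pixels
    cy: float         # box centre y, in pixels
    width: float      # box length along its major axis
    height: float     # box length along its minor axis
    angle_deg: float  # box rotation in the image plane


def grasp_and_transfer(detections, instrument):
    """Return a grasp description for the first detection matching `instrument`."""
    for det in detections:
        if det.label == instrument:
            # Assumption: the gripper yaw simply follows the box orientation.
            return (f"grasp {det.label} at ({det.cx:.1f}, {det.cy:.1f}) "
                    f"with gripper yaw {det.angle_deg:.1f} deg")
    return f"{instrument} not detected"


def count_instruments(detections):
    """Count detected instruments per label."""
    counts = {}
    for det in detections:
        counts[det.label] = counts.get(det.label, 0) + 1
    return counts


# Registry mapping tool names (as the LLM would emit them) to robot actions.
TOOLS = {
    "grasp_and_transfer": grasp_and_transfer,
    "count_instruments": count_instruments,
}


def dispatch(tool_call_json, detections):
    """Parse a JSON tool call and run the matching action on the detections."""
    call = json.loads(tool_call_json)
    name = call.get("name")
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name!r}")
    return TOOLS[name](detections, **call.get("arguments", {}))


if __name__ == "__main__":
    detections = [
        Detection("scalpel", 120.0, 80.0, 150.0, 12.0, 30.0),
        Detection("forceps", 300.0, 210.0, 140.0, 20.0, -15.0),
    ]
    # A tool call such as an LLM might produce for "pass me the scalpel".
    call = '{"name": "grasp_and_transfer", "arguments": {"instrument": "scalpel"}}'
    print(dispatch(call, detections))
```

Keeping the registry explicit means an unrecognized tool name fails loudly instead of triggering an unintended motion. In a real system the returned description would be replaced by a call into the motion planner.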