MANIPULATION

Large Language Model‐Embedded Intelligent Robotic Scrub Nurse with Multimodal Input for Enhancing Surgeon–Robot Interaction

Wing Yin Ng, Wanyu Ma, Pheng‐Ann Heng, Philip Wai Yan Chiu, Zheng Li

Year: 2025
Citations: 1
Access: Open access

Abstract

Scrub nurses have crucial responsibilities, particularly in handling instrument‐related tasks. However, significant mental burdens and unfamiliarity with instruments can lead to various human errors. Consequently, the research community has explored robotic prototypes. Unfortunately, these prototypes often focus on specific instrument‐handling tasks or offer non‐intuitive interaction methods, hindering social acceptance. This article proposes a surgeon‐friendly robotic scrub nurse platform that addresses multiple instrument‐related tasks, including grasping and transferring, automatic sorting, and counting. To the best of the authors’ knowledge, this is the first prototype to incorporate audiovisual input modalities and a large language model (LLM) for smooth and intuitive interaction between the surgeon and the robot. Specifically, vision artificial intelligence (AI) provides accurate instrument detection results using oriented bounding boxes with an average precision of 97.6%, guiding robot motion planning. The speech AI recognizes the surgeon's voice commands. The LLM further interprets multimodal information to trigger different robot actions via the “tool use” capability, achieving a standalone success rate of 94% with an average action latency of less than 1 s on a real robotic scrub nurse hardware platform. Physical validation demonstrated that the proposed prototype successfully completed all assigned tasks, proving its feasibility and effectiveness.
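The abstract describes the LLM triggering robot actions through "tool use", with vision detections (oriented bounding boxes) guiding the motion. A minimal sketch of that dispatch step follows. It is an illustration under stated assumptions, not the authors' implementation: the tool names `transfer_instrument` and `count_instruments`, the JSON tool-call shape, and the detection record fields (`label`, `center`, `size`, `angle_deg`) are all hypothetical, and the robot actions are replaced by functions that return status strings.

```python
import json
from typing import Callable, Dict, List

# Hypothetical detection record: an instrument label plus an oriented bounding
# box given as centre (x, y), size (w, h) and rotation angle in degrees.
Detection = Dict[str, object]


def transfer_instrument(name: str, detections: List[Detection]) -> str:
    """Stand-in for a grasp-and-transfer routine; returns a status string."""
    for det in detections:
        if det["label"] == name:
            cx, cy = det["center"]
            return f"grasp {name} at ({cx}, {cy}) rotated {det['angle_deg']} deg"
    return f"{name} not detected"


def count_instruments(detections: List[Detection]) -> str:
    """Stand-in for the counting task: tally detections per label."""
    counts: Dict[str, int] = {}
    for det in detections:
        label = str(det["label"])
        counts[label] = counts.get(label, 0) + 1
    return json.dumps(counts, sort_keys=True)


# Registry of callable tools (assumed names, not taken from the paper).
TOOLS: Dict[str, Callable[..., str]] = {
    "transfer_instrument": transfer_instrument,
    "count_instruments": count_instruments,
}


def dispatch(tool_call_json: str, detections: List[Detection]) -> str:
    """Parse a tool call of the assumed form {"name": ..., "arguments": {...}}
    and run the matching tool against the current detections."""
    call = json.loads(tool_call_json)
    tool = TOOLS.get(call.get("name"))
    if tool is None:
        return f"unknown tool: {call.get('name')}"
    return tool(detections=detections, **call.get("arguments", {}))


if __name__ == "__main__":
    # Example detections, invented for illustration only.
    detections = [
        {"label": "scissors", "center": (120, 80), "size": (40, 10), "angle_deg": 30.0},
        {"label": "forceps", "center": (200, 150), "size": (60, 8), "angle_deg": -15.0},
    ]
    print(dispatch('{"name": "transfer_instrument", "arguments": {"name": "scissors"}}', detections))
    print(dispatch('{"name": "count_instruments"}', detections))
```

In this sketch the language model only chooses a tool name and its arguments, while the geometry comes from the vision detections. That split is one plausible reading of the pipeline the abstract outlines.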

Keywords

Human–computer interaction, Robot, Computer science, Human–robot interaction, Artificial intelligence
