Deep Multimodal Imitation Learning-Based Framework for Robot-Assisted Medical Examination

Weiyong Si, Ning Wang, Rebecca Harris, Chenguang Yang

Year
2025
Citations
2

Abstract

Medical ultrasound examination is a challenging dexterous manipulation task for robots. Even for experienced sonographers, it involves real-time decision-making, motion control, and force regulation based on ultrasound images and patient feedback. In this article, we propose a unified framework for robot-assisted medical examination, specifically for the initial registration in artery scanning, by leveraging deep multimodal imitation learning, compliant control, and trajectory optimization. To process multimodal inputs during the initial registration phase, we investigate a deep imitation learning model that fuses RGB and ultrasound images, contact force, and proprioceptive data. The deep imitation learning model predicts the desired motion and contact force. We design a compliant controller in Cartesian space to track the desired trajectory and force. To smooth the trajectory and ensure safety, we employ a trajectory optimization planner between the deep imitation learning module and the low-level compliant controller. We evaluate the generalization capability of the deep multimodal imitation learning module, the control performance, and the quality of the acquired ultrasound images on both a phantom and human subjects. Experimental results show that the proposed approach significantly improves the success rate of autonomous ultrasound scanning from 75% to 90%, while also reducing the completion time.
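
The abstract describes a three-stage pipeline: a learned policy predicts the desired motion and contact force, a planner smooths the motion, and a compliant controller tracks both. The following is a minimal sketch of how such stages might be chained in one control step, not the authors' implementation. It swaps the trajectory optimization planner for a simple low-pass filter with a per-step displacement clamp, and the compliant controller for a one-axis damping-only admittance law. All names (`Setpoint`, `smooth_and_limit`, `admittance_offset`, `control_step`), the gains, and the convention that the probe presses along negative z are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Setpoint:
    """Hypothetical policy output: desired probe position (m) and contact force (N)."""
    position: List[float]
    force: float


def smooth_and_limit(prev: List[float], target: List[float],
                     alpha: float = 0.3, max_step: float = 0.005) -> List[float]:
    """Stand-in for the planner: low-pass filter the target and clamp each axis step."""
    out = []
    for p, t in zip(prev, target):
        step = alpha * (t - p)                      # filtered move toward the target
        step = max(-max_step, min(max_step, step))  # safety clamp on displacement
        out.append(p + step)
    return out


def admittance_offset(f_desired: float, f_measured: float,
                      damping: float = 200.0, dt: float = 0.01) -> float:
    """Damping-only admittance: displacement per step proportional to the force error."""
    return (f_desired - f_measured) / damping * dt


def control_step(current_pos: List[float], predicted: Setpoint,
                 f_measured: float, probe_axis: int = 2) -> List[float]:
    """Combine the smoothed motion command with a force correction along the probe axis."""
    cmd = smooth_and_limit(current_pos, predicted.position)
    # Assumed convention: the probe presses along negative z, so too little
    # measured force moves the command further in the negative direction.
    cmd[probe_axis] -= admittance_offset(predicted.force, f_measured)
    return cmd
```

In this sketch, a measured force below the desired force nudges the command toward the skin, while the clamp bounds how far any single step can move the probe regardless of what the policy predicts.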

Keywords

Computer science, Artificial intelligence, Imitation, Robot, Human–computer interaction, Medical robotics, Deep learning, Machine learning, Psychology
