Learning Surgical Robotic Manipulation with 3D Spatial Priors
Yu Sheng, Lidian Wang, Xiaomeng Chu, Jiajun Deng, Min Cheng, Yanyong Zhang, Bei Hua, Houqiang Li, Jianmin Ji
- Publication year: 2026
- Access: Open access
Abstract
Achieving 3D spatial awareness is crucial for surgical robotic manipulation, where precise and delicate operations are required. Existing methods either explicitly reconstruct the surgical scene prior to manipulation, or enhance multi-view features by adding wrist-mounted cameras to supplement the default stereo endoscopes. However, both paradigms suffer from notable limitations: the former easily leads to error accumulation and prevents end-to-end optimization due to its multi-stage nature, while the latter is rarely adopted in clinical practice since wrist-mounted cameras can interfere with the motion of surgical robot arms. In this work, we introduce the Spatial Surgical Transformer (SST), an end-to-end visuomotor policy that empowers surgical robots with 3D spatial awareness by directly exploring 3D spatial cues embedded in endoscopic images. First, we build Surgical3D, a large-scale photorealistic dataset containing 30K stereo endoscopic image pairs with accurate 3D geometry, addressing the scarcity of 3D data in surgical scenes. Based on Surgical3D, we finetune a powerful geometric transformer to extract robust 3D latent representations from stereo endoscope images. These representations are then seamlessly aligned with the robot's action space via a lightweight multi-level spatial feature connector (MSFC), all within an endoscope-centric coordinate frame. Extensive real-robot experiments demonstrate that SST achieves state-of-the-art performance and strong spatial generalization on complex surgical tasks such as knot tying and ex-vivo organ dissection, representing a significant step toward practical clinical deployment. The dataset and code will be released.