Holistic Surgical Phase Recognition with Hierarchical Input Dependent State Space Models
Haoyang Wu, Tsun-Hsuan Wang, Mathias Lechner, Ramin Hasani, Jennifer A. Eckhoff, Paul Pak, Ozanan R. Meireles, Guy Rosman, Yutong Ban, Daniela Rus
- Year: 2025
- Access: Open access
Abstract
Surgical workflow analysis is essential in robot-assisted surgery, yet the long duration of such procedures makes comprehensive video analysis difficult. Recent approaches have relied predominantly on transformer models; however, the quadratic cost of their attention mechanism prevents efficient processing of lengthy surgical videos. In this paper, we propose a novel hierarchical input-dependent state space model that exploits the linear scaling of state space models to enable decision making on full-length videos while capturing both local and global dynamics. Our framework incorporates a temporally consistent visual feature extractor, which appends a state space model head to a visual feature extractor to propagate temporal information. The proposed model consists of two key modules: a local-aggregation state space model block that captures intricate local dynamics, and a global-relation state space model block that models temporal dependencies across the entire video. The model is trained with a hybrid discrete-continuous supervision strategy, in which both discrete phase labels and continuous phase progress signals are propagated through the network. Experiments show that our method outperforms current state-of-the-art methods by a large margin (+2.8% on Cholec80, +4.3% on MICCAI2016, and +12.9% on HeiChole). Code will be made publicly available after paper acceptance.
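As a rough illustration of two ideas named in the abstract, the sketch below shows (a) an input-dependent recurrence that visits each frame once, so its cost grows linearly with video length, and (b) one way to pair discrete phase labels with a continuous progress target. It is not the authors' implementation. The sigmoid gate, the scalar state, the definition of progress as the fraction elapsed within a contiguous phase segment, the mean-squared-error term, and the `weight` parameter are all assumptions made for this example, and the function names are hypothetical.

```python
import math


def selective_scan(xs, w_gate=1.0, w_in=1.0):
    """Scalar input-dependent recurrence: h_t = a_t * h_{t-1} + (1 - a_t) * w_in * x_t.

    The decay a_t is computed from the current input (assumed sigmoid gate),
    and the loop touches each frame once, so the cost is O(T).
    """
    h = 0.0
    states = []
    for x in xs:
        a = 1.0 / (1.0 + math.exp(-w_gate * x))  # input-dependent decay in (0, 1)
        h = a * h + (1.0 - a) * w_in * x
        states.append(h)
    return states


def phase_progress(labels):
    """Assumed progress target: fraction elapsed within each contiguous phase segment."""
    progress = [0.0] * len(labels)
    start = 0
    for t in range(1, len(labels) + 1):
        # A segment ends at the last frame or where the label changes.
        if t == len(labels) or labels[t] != labels[start]:
            length = t - start
            for i in range(start, t):
                progress[i] = (i - start) / (length - 1) if length > 1 else 1.0
            start = t
    return progress


def hybrid_loss(logits, labels, pred_progress, true_progress, weight=1.0):
    """Mean cross-entropy on phase labels plus weighted mean squared error on progress."""
    n = len(labels)
    ce = 0.0
    for row, y in zip(logits, labels):
        m = max(row)  # subtract the max for a numerically stable log-sum-exp
        log_z = m + math.log(sum(math.exp(v - m) for v in row))
        ce += log_z - row[y]
    mse = sum((p - q) ** 2 for p, q in zip(pred_progress, true_progress))
    return ce / n + weight * mse / n


if __name__ == "__main__":
    labels = [0, 0, 0, 1, 1]  # toy per-frame phase labels
    print(phase_progress(labels))
    print(selective_scan([0.5, -1.0, 2.0]))
```

In this toy setup the progress target restarts at each phase boundary, which gives the network a smooth signal inside a phase in addition to the hard label; other definitions of progress (for example, normalised over the whole procedure) would fit the same loss.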