SPAN-Nav: Generalized Spatial Awareness for Versatile Vision-Language Navigation
Jiahang Liu, Tianyu Xu, Jiawei Chen, Lu Yue, Jiazhao Zhang, Zhiyong Wang, Minghan Li, Qisheng Zhao, Anqi Li, Qi Su, Zhizheng Zhang, He Wang
- Year
- 2026
- Access
- Open access
Abstract
Recent embodied navigation approaches leveraging Vision-Language Models (VLMs) demonstrate strong generalization in versatile Vision-Language Navigation (VLN). However, reliable path planning in complex environments remains challenging due to insufficient spatial awareness. In this work, we introduce SPAN-Nav, an end-to-end foundation model designed to infuse embodied navigation with universal 3D spatial awareness using RGB video streams. SPAN-Nav extracts spatial priors across diverse scenes through an occupancy prediction task on extensive indoor and outdoor environments. To mitigate the computational burden, we introduce a compact representation for spatial priors, finding that a single token is sufficient to encapsulate the coarse-grained cues essential for navigation tasks. Furthermore, inspired by the Chain-of-Thought (CoT) mechanism, SPAN-Nav utilizes this single spatial token to explicitly inject spatial cues into action reasoning through an end-to end framework. Leveraging multi-task co-training, SPAN-Nav captures task-adaptive cues from generalized spatial priors, enabling robust spatial awareness to generalize even to the task lacking explicit spatial supervision. To support comprehensive spatial learning, we present a massive dataset of 4.2 million occupancy annotations that covers both indoor and outdoor scenes across multi-type navigation tasks. SPAN-Nav achieves state-of-the-art performance across three benchmarks spanning diverse scenarios and varied navigation tasks. Finally, real-world experiments validate the robust generalization and practical reliability of our approach across complex physical scenarios.
Keywords
Related papers
A dual-loop framework for manufacturability-aware topology optimization of electric vehicle structures via wire arc additive manufacturing
Qiang Cui, Chuan Yu, Daoqian Yang +2 more
Robotics and Computer-Integrated Manufacturing · 2026
Geometric digital twin: A digital and intelligent model for aero-engine assembly accuracy prediction
Ke Shang, Xin Jin, Teli Xu +4 more
Robotics and Computer-Integrated Manufacturing · 2026
Revolutionizing Industries Through AI-Driven Robotics
Aryan Chaudhary
Recent Advances in Computer Science and Communications · 2026
Design and dynamic performance prediction of a novel large-aperture offset-feed deployable antenna
Chuang Shi, Tianming Liu, Ning Xue +6 more
Aerospace Science and Technology · 2026