A Comprehensive Survey of Visual SLAM Technology: Methods, Challenges, and Perspectives
Aidos Ibrayev, Amanzhol Bektemessov
- Year: 2025
- Citations: 3
- Access: Open access
Abstract
Visual Simultaneous Localization and Mapping (Visual SLAM) has become a cornerstone of autonomous navigation and spatial understanding in robotics, augmented reality, and computer vision. This review presents a comprehensive examination of algorithmic progress in Visual SLAM, focusing on the three principal paradigms: monocular, stereo, and RGB-D SLAM. Monocular SLAM, known for its minimal hardware requirements, has evolved from feature-based methods to deep learning-enhanced systems, addressing challenges such as scale ambiguity and drift. Stereo SLAM recovers depth through triangulation, improving scale accuracy and robustness, particularly in dynamic and low-texture environments. RGB-D SLAM, which relies on depth-sensing technology, has enabled dense and semantically enriched mapping and has found significant application in indoor and real-time scenarios. Through a chronological and technical exploration of representative methods, including RatSLAM, ORB-SLAM, DSO, ProSLAM, ElasticFusion, DynaSLAM, and recent hybrid and learning-based frameworks, this review identifies major milestones and architectural innovations across paradigms. A cross-paradigm analysis highlights the trade-offs in accuracy, computational efficiency, and adaptability, and discusses emerging trends such as semantic integration, multimodal fusion, and neural implicit representations. The paper also outlines future directions, including lifelong learning, real-time deployment on edge devices, dynamic environment adaptation, and the convergence of geometry-based and learning-based pipelines. Supported by a detailed taxonomy and visual summaries of the field's historical evolution, this review serves as a foundational reference for researchers and developers aiming to understand and contribute to the advancement of Visual SLAM technologies in both academic and real-world contexts.