LVID-SLAM: A Lightweight Visual-Inertial SLAM for Dynamic Scenes Based on Semantic Information
Xu Zhang, Wei Li, Ying Wang, Enhui Zheng
- Year
- 2025
- Citations
- 5
- Access
- Open access
Abstract
Simultaneous Localization and Mapping (SLAM) remains challenging in dynamic environments. Recent approaches combining deep learning with algorithms for dynamic scenes comprise two types: faster, less accurate object detection-based methods and highly accurate, computationally costly instance segmentation-based methods. In addition, maps lacking semantic information hinder robots from understanding their environment and performing complex tasks. This paper presents a lightweight visual-inertial SLAM system. The system is based on the classic ORB-SLAM3 framework, which starts a new thread for object detection and tightly couples the semantic information of object detection with geometric information to remove feature points from dynamic objects. In addition, Inertial Measurement Unit (IMU) data are employed to assist in feature point extraction, thereby compensating for visual pose tracking loss. Finally, a dense octree-based semantic map is constructed by fusing semantic information and visualized using ROS. LVID-SLAM demonstrates excellent pose accuracy and robustness in highly dynamic scenes on the public TUM dataset, with an average ATE reduction of more than 80% compared to ORB-SLAM3. The experimental results demonstrate that LVID-SLAM outperforms other methods in dynamic conditions, offering both real-time capability and robustness.
Keywords
Related papers
Statistical Learning Theory
Yuhai Wu, Vladimir Vapnik
1999
Artificial intelligence: a modern approach
1995
Applied Nonlinear Control
Jean-Jacques Slotine, Weiping Li
1991
A new optimizer using particle swarm theory
R.C. Eberhart, James Kennedy
2002