Human Tide, Clear Sight: Semantically Enhanced Visual Localization in High-Crowd Scenarios
Yida Wei, Sikang Liu, Zixuan Huang, Wei He, You Li
- Year: 2025
- Citations: 1
Abstract
Accurate visual localization is essential in IoT applications, particularly for robotics, autonomous systems, and augmented reality. Traditional feature-based methods struggle with efficiency and with robustness to environmental variations. To improve robustness to these variations, state-of-the-art (SOTA) methods incorporate semantic information into their models as an additional dimension, but they still have several shortcomings. They often embed semantic information implicitly, which limits extensibility and interpretability. Moreover, unstable semantic labels can degrade localization accuracy rather than improve it. Modularity, quantification, and stability-based filtering of semantic labels therefore become critical. To address these gaps, this article proposes a method that explicitly and quantitatively integrates semantic information through a plug-and-play module. The module scores image-to-image and feature-to-feature correspondences by semantic similarity and stability, with a particular focus on improving smartphone-based visual localization in high-crowd indoor scenarios. It is introduced into two key stages of hierarchical visual localization: 1) visual place recognition (coarse localization) and 2) 6-degree-of-freedom (6-DoF) pose estimation (fine localization). Specifically, correspondences with low scores imply a higher probability of matching errors and are therefore suppressed. To validate the proposed approach, a novel dataset for semantic visual localization, rich in dynamic objects and scene variations, is collected. The method demonstrates superior accuracy and robustness, particularly in environments with significant scene appearance changes, improving localization accuracy by 13.6% and 5.4% on the Cafds and Libds datasets, respectively, over the SOTA approach.
The code and dataset are available at https://github.com/1da1da/SEVL.
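The feature-to-feature scoring idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class names, the stability weights in `STABILITY`, the `threshold` value, and the functions `match_score` and `filter_matches` are all hypothetical, and plain label equality stands in for the paper's semantic similarity measure, which the abstract does not specify.

```python
# Hypothetical per-class stability weights (assumed values, not from the paper):
# static structure scores high, movable objects lower, people zero.
STABILITY = {"wall": 1.0, "floor": 0.9, "chair": 0.5, "person": 0.0}


def match_score(label_a, label_b, stability=STABILITY):
    """Score one feature-to-feature correspondence in [0, 1].

    Assumption: similarity is 1 when the two semantic labels agree and 0
    otherwise; the result is that similarity weighted by class stability.
    """
    if label_a != label_b:
        return 0.0
    return stability.get(label_a, 0.0)


def filter_matches(matches, labels_query, labels_ref, threshold=0.4):
    """Suppress correspondences whose semantic score falls below threshold.

    matches: list of (query_index, reference_index) pairs.
    Returns a list of (query_index, reference_index, score) for kept matches.
    """
    kept = []
    for qi, ri in matches:
        score = match_score(labels_query[qi], labels_ref[ri])
        if score >= threshold:
            kept.append((qi, ri, score))
    return kept


# Toy example: one semantic label per keypoint in each image.
labels_query = ["wall", "person", "chair", "floor"]
labels_ref = ["wall", "person", "floor", "floor"]
matches = [(0, 0), (1, 1), (2, 2), (3, 3)]

kept = filter_matches(matches, labels_query, labels_ref)
print(kept)
```

In this toy run the person-to-person match is dropped because its class is treated as unstable, and the chair-to-floor match is dropped because the labels disagree. The surviving matches would then feed the pose estimation stage.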