ARFF-VO: A Self-Supervised Monocular Visual Odometry With Adaptive Region-Based Feature Filtering in Dynamic Scenes
Guangdong Tong, Yuanyang Zhang, Zheng Li, Jiaru Sun
- Publication year: 2025
- Citations: 2
Abstract
Self-supervised monocular visual odometry offers a solution for robot localization and mapping without labeled data: the network is trained by minimizing an image reconstruction loss. However, existing methods explicitly remove dynamic objects by introducing semantic masks, which limits their adaptability to dynamic pixels. In this paper, we propose ARFF-VO, which integrates a dynamic removal strategy into the network so that the model can adaptively suppress redundant information. To fully exploit non-redundant features, we introduce a Region-Structure Perception (R-SP) module that uses high-level semantic information to construct perception features and confidence scores. Additionally, we build the pose decoder from the Vim block, whose core operator is a selective state space model. The model effectively compresses contextual information to enhance long-sequence modeling capability. Furthermore, because monocular depth estimation and pose prediction are trained jointly, the improvement in visual odometry also benefits depth estimation. Evaluations on the KITTI dataset demonstrate that our method achieves superior performance compared to state-of-the-art self-supervised methods.
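The abstract does not spell out the image reconstruction loss, so the following is only a minimal sketch of the formulation commonly used in self-supervised visual odometry, not the ARFF-VO implementation. It assumes a pinhole camera with intrinsics `K`, a predicted depth map for the target frame, and a predicted rigid transform `T` from the target camera to the source camera. The function names, the nearest-neighbour sampling (training code would normally use differentiable bilinear sampling), and the plain L1 error are all illustrative assumptions.

```python
import numpy as np


def warp_source_to_target(src, depth, K, T):
    """Synthesize the target view by sampling `src` at reprojected pixels.

    src:   (H, W) grayscale source image
    depth: (H, W) depth map of the target view
    K:     (3, 3) pinhole camera intrinsics
    T:     (4, 4) rigid transform, target camera frame -> source camera frame
    Returns the warped image and a boolean mask of validly sampled pixels.
    """
    H, W = depth.shape
    v, u = np.mgrid[0:H, 0:W]
    pix = np.stack([u, v, np.ones_like(u)], axis=0).reshape(3, -1).astype(np.float64)

    # Back-project target pixels to 3D points in the target camera frame.
    pts = (np.linalg.inv(K) @ pix) * depth.reshape(1, -1)
    # Move the points into the source camera frame.
    pts_src = T[:3, :3] @ pts + T[:3, 3:4]
    # Project the points onto the source image plane.
    proj = K @ pts_src
    z = proj[2]
    valid = z > 1e-6
    z_safe = np.where(valid, z, 1.0)
    us = proj[0] / z_safe
    vs = proj[1] / z_safe

    # Nearest-neighbour sampling (an assumption made for brevity).
    ui = np.rint(us).astype(int)
    vi = np.rint(vs).astype(int)
    valid &= (ui >= 0) & (ui < W) & (vi >= 0) & (vi < H)

    warped = np.zeros(H * W)
    warped[valid] = src[vi[valid], ui[valid]]
    return warped.reshape(H, W), valid.reshape(H, W)


def photometric_loss(tgt, src, depth, K, T):
    """Mean absolute error between the target image and the warped source."""
    warped, valid = warp_source_to_target(src, depth, K, T)
    if not valid.any():
        return 0.0
    return float(np.abs(tgt - warped)[valid].mean())
```

In a training loop, `depth` and `T` would come from the depth and pose networks, and the loss gradient would update both. That shared signal is one way to read the abstract's remark that better pose prediction also helps depth estimation.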