NavFormer: A Transformer Architecture for Robot Target-Driven Navigation in Unknown and Dynamic Environments
Haitong Wang, Aaron Hao Tan, Goldie Nejat
- Year
- 2024
- Citations
- 33
Abstract
In unknown cluttered and dynamic environments such as disaster scenes, mobile robots need to perform target-driven navigation in order to find people or objects of interest, where the only information provided about these targets are images of the individual targets. In this letter, we introduce NavFormer, a novel end-to-end transformer architecture developed for robot target-driven navigation in unknown and dynamic environments. NavFormer leverages the strengths of both 1) transformers for sequential data processing and 2) self-supervised learning (SSL) for visual representation to reason about spatial layouts and to perform collision-avoidance in dynamic settings. The architecture uniquely combines dual-visual encoders consisting of a static encoder for extracting invariant environment features for spatial reasoning, and a general encoder for dynamic obstacle avoidance. The primary robot navigation task is decomposed into two sub-tasks for training: single robot exploration and multi-robot collision avoidance. We perform cross-task training to enable the transfer of learned skills to the complex primary navigation task. Simulated experiments demonstrate that NavFormer can effectively navigate a mobile robot in diverse unknown environments, outperforming existing state-of-the-art methods. A comprehensive ablation study is performed to evaluate the impact of the main design choices of NavFormer. Furthermore, real-world experiments validate the generalizability of NavFormer.
Keywords
Related papers
Statistical Learning Theory
Yuhai Wu, Vladimir Vapnik
1999
Artificial intelligence: a modern approach
1995
Applied Nonlinear Control
Jean-Jacques Slotine, Weiping Li
1991
A new optimizer using particle swarm theory
R.C. Eberhart, James Kennedy
2002