
VLN-ChEnv: Vision-language Navigation in Changeable Environments

謙二 高橋, Qi Wu, Peng Wang

Year
2025
Citations
1

Abstract

Commanding robots to do chores using natural language instructions has long been a dream. Navigation, one of the key foundational abilities needed to achieve this goal, has garnered significant attention. When human users instruct an intelligent agent, the instructions they give sometimes deviate slightly from navigable ones, because the user's understanding of the scene may be out of date after sudden changes in the environment. This paper investigates three common scenarios in which instructions and navigation scenes are imperfectly aligned: changes in navigability, incorrect landmark references, and incorrect direction descriptions. We then propose the ImperfectVLN task and dataset for evaluating an agent's navigation performance when instructions and environment are imperfectly matched. Evaluation results indicate significant performance fluctuations in existing state-of-the-art models under modification scenarios including removal of referred landmarks and blockage of the original path. We also provide a series of result analyses and further insights. We hope this new dataset becomes a valuable benchmark for advancing practical VLN tasks. Building on our insights, we further design a reflection module that allows an agent to review its history and identify potential errors. Experiments show that this module improves performance on ImperfectVLN by 4.4%.

Keywords

Landmark, Task (project management), Key (lock), Robot, Path (computing), Reflection (computer programming), Task analysis
