VLN-ChEnv: Vision-language Navigation in Changeable Environments

謙二高橋, Qi Wu, Peng Wang

Publication year: 2025
Citations: 1

Abstract

Commanding robots to do chores through natural language instructions has long been a dream. Navigation, one of the key foundational abilities needed to achieve this goal, has attracted significant attention. When human users instruct an intelligent agent, the instructions they give sometimes differ slightly from navigable ones, because a user's understanding of the scene may be out of date after sudden changes in the environment. This paper investigates three common scenarios in which instructions and navigation scenes are imperfectly aligned: changes in navigability, incorrect landmark references, and incorrect direction descriptions. We then propose the ImperfectVLN task and dataset for evaluating an agent's navigation performance when the instruction and the environment are imperfectly matched. Evaluation results indicate significant performance fluctuations in existing state-of-the-art models under modification scenarios, including removal of referenced landmarks and blockage of the original path. We also provide a series of result analyses and further insights. We intend this new dataset to serve as a valuable benchmark for advancing practical VLN tasks. Based on our insights, we further design a reflection module that allows an agent to review its history and identify potential errors. Experiments show that this module improves performance on ImperfectVLN by 4.4%.

Keywords

Landmark; Task (project management); Key (lock); Robot; Path (computing); Reflection (computer programming); Task analysis
