Grounding Hierarchical Vision-Language-Action Models Through Explicit Language-Action Alignment
Theodor Wulff, Federico Tavella, Rahul Singh Maharjan, Manith Adikari, Angelo Cangelosi
- 发表年份
- 2026
- 访问权限
- 开放获取
摘要
Achieving robot transparency is a critical step toward effective human-robot collaboration. To be transparent, a robot's natural language communication must be consistent with its actions and explicitly grounded in the task and environment. Existing hierarchical Vision-Language-Action (VLA) models can generate language (e.g., through chain-of-thought) and low-level actions. However, current work does not consider explicit alignment between these modalities during training. To address this crucial gap, we propose a novel training framework that explicitly grounds hierarchical VLA sub-task descriptions with respect to the visual observation and action space. Our framework uses a contrastive model to assess the alignment between generated language and corresponding action trajectories. This contrastive model enables direct ranking of different language-trajectory pairs based on their alignment, allowing us to refine the grounding of our hierarchical VLA through offline preference learning. We apply our framework to the LanguageTable dataset, a benchmark dataset of human language-annotated trajectories, and provide critical insights into multimodal grounding representations, all while establishing a strong baseline that achieves performance comparable to fully supervised fine-tuning and minimizing the need for costly data annotations.
关键词
相关论文
工业5.0中人机协作的多模态感知、互认知与具身执行综述与展望
Kai Ding, Qingyuan Mao, Yaqian Zhang 等 6 位作者
Robotics and Computer-Integrated Manufacturing · 2026
迈向以人为中心的制造:人机协作装配中不确定性下的任务规划
Yingchao You, Ze Ji, Changyun Wei
Robotics and Computer-Integrated Manufacturing · 2026
代理式人机协作:通过记忆实现上下文对齐
Jiahui Si, Wenchao Li, Xi Chen 等 7 位作者
Robotics and Computer-Integrated Manufacturing · 2026
自适应物理信息Transformer结合高斯过程残差补偿用于人机协作中的逆动力学建模
Rui Qian, Xi Zhang, Dongpeng Li 等 5 位作者
Robotics and Computer-Integrated Manufacturing · 2026