World-Value-Action Model: Implicit Planning for Vision-Language-Action Systems
Runze Li, Hongyin Zhang, Junxi Jin, Qixin Zeng, Zifeng Zhuang, Yiqi Tang, Shangke Lyu, Donglin Wang
- Publication year
- 2026
- Access
- Open access
Abstract
Vision-Language-Action (VLA) models have emerged as a promising paradigm for building embodied agents that ground perception and language into action. However, most existing approaches rely on direct action prediction and cannot reason over long-horizon trajectories or evaluate their consequences, which limits performance in complex decision-making tasks. In this work, we introduce the World-Value-Action (WAV) model, a unified framework that enables implicit planning in VLA systems. Rather than performing explicit trajectory optimization, the WAV model learns a structured latent representation of future trajectories conditioned on visual observations and language instructions. A learned world model predicts future states, while a trajectory value function evaluates their long-horizon utility. Action generation is then formulated as inference in this latent space, where the model progressively concentrates probability mass on high-value and dynamically feasible trajectories. We provide a theoretical perspective showing that planning directly in action space suffers from an exponential decay in the probability of feasible trajectories as the horizon increases. In contrast, latent-space inference reshapes the search distribution toward feasible regions, enabling efficient long-horizon decision making. Extensive simulation and real-world experiments demonstrate that the WAV model consistently outperforms state-of-the-art methods, achieving significant improvements in task success rate, generalization ability, and robustness, especially in long-horizon and compositional scenarios. Code is available at https://github.com/Win-commit/WAV.
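The exponential-decay claim in the abstract can be illustrated with a toy calculation. The sketch below is not the paper's analysis or code: it assumes, purely for illustration, that each step of a randomly sampled action sequence is feasible independently with a fixed probability `p_step`, so a trajectory of length `horizon` is feasible with probability `p_step ** horizon`. The function names and parameters are hypothetical.

```python
import random


def analytic_feasible_fraction(p_step: float, horizon: int) -> float:
    """Closed form under the independence assumption: p_step ** horizon."""
    return p_step ** horizon


def sampled_feasible_fraction(
    p_step: float, horizon: int, n_samples: int = 20000, seed: int = 0
) -> float:
    """Monte Carlo estimate of the fraction of sampled trajectories
    whose every step is feasible (toy model, independent steps)."""
    rng = random.Random(seed)
    feasible = 0
    for _ in range(n_samples):
        # A trajectory counts as feasible only if all of its steps are.
        if all(rng.random() < p_step for _ in range(horizon)):
            feasible += 1
    return feasible / n_samples


if __name__ == "__main__":
    for horizon in (1, 5, 10, 20, 40):
        exact = analytic_feasible_fraction(0.8, horizon)
        estimate = sampled_feasible_fraction(0.8, horizon)
        print(f"H={horizon:2d}  exact={exact:.6f}  sampled={estimate:.6f}")
```

Under this toy assumption, the share of feasible trajectories shrinks geometrically with the horizon, which is the intuition behind preferring a search distribution that is already concentrated on feasible regions.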