DriveVA: Video Action Models are Zero-Shot Drivers
Mengmeng Liu, Diankun Zhang, Jiuming Liu, Jianfeng Cui, Hongwei Xie, Guang Chen, Hangjun Ye, Michael Ying Yang, Francesco Nex, Hao Cheng
- Publication year
- 2026
- Access
- Open access
Abstract
Generalization is a central challenge in autonomous driving, as real-world deployment requires robust performance under unseen scenarios, sensor domains, and environmental conditions. Recent world-model-based planning methods have shown strong capabilities in scene understanding and multi-modal future prediction, yet their generalization across datasets and sensor configurations remains limited. In addition, their loosely coupled planning paradigm often leads to poor video-trajectory consistency during visual imagination. To overcome these limitations, we propose DriveVA, a novel autonomous driving world model that jointly decodes future visual forecasts and action sequences in a shared latent generative process. DriveVA inherits rich priors on motion dynamics and physical plausibility from well-pretrained large-scale video generation models, allowing it to capture continuous spatiotemporal evolution and causal interaction patterns. Specifically, DriveVA employs a DiT-based decoder to jointly predict future action sequences (trajectories) and videos, enabling tighter alignment between planning and scene evolution. We also introduce a video continuation strategy to strengthen long-duration rollout consistency. DriveVA achieves a closed-loop PDM score of 90.9 on the challenging NAVSIM benchmark. Extensive experiments also demonstrate the zero-shot capability and cross-domain generalization of DriveVA: compared with the state-of-the-art world-model-based planner, it reduces average L2 error and collision rate by 78.9% and 83.3%, respectively, on nuScenes, and by 52.5% and 52.4%, respectively, on Bench2Drive, which is built on CARLA v2.