Simulation to Rules: A Dual-VLM Framework for Formal Visual Planning
Yilun Hao, Yongchao Chen, Chuchu Fan, Yang Zhang
- Publication year: 2025
- Access: Open access
Abstract
Vision Language Models (VLMs) show strong potential for visual planning but struggle with precise spatial and long-horizon reasoning, while Planning Domain Definition Language (PDDL) planners excel at formal long-horizon planning but cannot interpret visual inputs. Recent work combines these complementary strengths by translating visual problems into PDDL. However, while VLMs can generate PDDL problem files satisfactorily, accurately generating PDDL domain files, which encode the planning rules, remains challenging and typically requires human expertise or environment interaction. We propose VLMFP, a Dual-VLM-guided framework that autonomously generates both PDDL problem and domain files for formal visual planning. VLMFP combines a SimVLM, which simulates action consequences, with a GenVLM, which generates and iteratively refines PDDL files by aligning symbolic execution with simulated outcomes. This design enables multiple levels of generalization: across unseen instances, visual appearances, and game rules. We evaluate VLMFP on 6 grid-world domains and demonstrate its generalization capability. On average, SimVLM achieves 87.3% and 86.0% on scenario understanding and action simulation for seen and unseen appearances, respectively. With the guidance of SimVLM, VLMFP attains 70.0% and 54.1% planning success on unseen instances in seen and unseen appearances, respectively. We further demonstrate that VLMFP scales to complex long-horizon 3D planning tasks, including multi-robot collaboration and assembly scenarios with partial observability and diverse visual variations. Project page: https://sites.google.com/view/vlmfp.
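The idea of aligning symbolic execution with simulated outcomes can be illustrated with a toy sketch. This is not the authors' implementation: no VLM or PDDL planner is involved. A hand-written grid simulator stands in for SimVLM, picking among two fixed candidate rule sets stands in for GenVLM's iterative refinement, and every name below (`make_model`, `alignment_score`, `refine`, the 3x3 grid and its wall) is hypothetical.

```python
from typing import Callable, Iterable, List, Optional, Sequence, Tuple

State = Tuple[int, int]
Plan = Sequence[str]
# A model executes a plan from a state; None means the plan is infeasible.
Model = Callable[[State, Plan], Optional[State]]

MOVES = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}
SIZE = 3            # hypothetical 3x3 grid
WALLS = {(1, 0)}    # hypothetical single wall cell


def make_model(walls_block: bool) -> Model:
    """Build a toy executor whose only rule choice is whether walls block moves."""
    def run(state: State, plan: Plan) -> Optional[State]:
        x, y = state
        for action in plan:
            dx, dy = MOVES[action]
            nx, ny = x + dx, y + dy
            if not (0 <= nx < SIZE and 0 <= ny < SIZE):
                return None  # leaving the grid is never allowed
            if walls_block and (nx, ny) in WALLS:
                return None  # this rule set forbids entering a wall
            x, y = nx, ny
        return (x, y)
    return run


def alignment_score(candidate: Model, simulator: Model,
                    start: State, probes: Iterable[Plan]) -> float:
    """Fraction of probe plans on which candidate and simulator agree."""
    probe_list = list(probes)
    agree = sum(candidate(start, p) == simulator(start, p) for p in probe_list)
    return agree / len(probe_list)


def refine(candidates: List[Tuple[str, Model]], simulator: Model,
           start: State, probes: List[Plan]) -> Tuple[Optional[str], float]:
    """Try candidates in order; keep the best aligned, stop at full agreement."""
    best_name, best_score = None, -1.0
    for name, candidate in candidates:
        score = alignment_score(candidate, simulator, start, probes)
        if score > best_score:
            best_name, best_score = name, score
        if score == 1.0:
            break
    return best_name, best_score


simulator = make_model(walls_block=True)  # stand-in for SimVLM (assumption)
candidates = [("walls_ignored", make_model(False)),
              ("walls_block", make_model(True))]
probes = [["right"], ["down", "right"], ["right", "right"], ["up"]]
best, score = refine(candidates, simulator, (0, 0), probes)
print(best, score)
```

In this toy setting the rule set that ignores walls disagrees with the simulator on the two probes that step into the wall, so the loop settles on the rule set that treats walls as blocking.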