LEARNING
Potential-Guided Flow Matching for Vision-Language-Action Policy Improvement
Yunpeng Mei, Jiakai He, Hongjie Cao, Chenyu Wang, Xiaowen Zhu, Yihan Zhou, Jiamin Wang, Chenbo Xin, Peng Cheng, Yuxuan Yang, Yijie Wang, Xinhu Zheng, Gao Huang, Jie Chen, Gang Wang
- Publication year: 2026
- Access: Open access
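Abstract
This paper presents ForesightFlow, a self-guided flow-matching policy. It uses a decoupled advantage-weighted flow matching method to enable best-of-K inference over action chunks without relying on an external critic. The method resolves the supervision conflict between policy improvement and value calibration, and significantly improves the deployment performance of vision-language-action policies.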
Keywords
flow matching, vision-language-action, policy improvement, advantage weighting, best-of-K inference