Aligning Few-Step Generative Models by Amortizing Sample-based Variational Inference
Jaewoo Lee, Hyeongyu Kang, Dohyun Kim, Kyuil Sim, Woocheol Shin, Minsu Kim, Taeyoung Yun, Jeongjae Lee, Sanghyeok Choi, Tabitha Edith Lee, Jongchul Ye, Jinkyoo Park
2026
Abstract
Aligning a few-step generative model is challenging, since existing alignment frameworks typically rely on restrictive assumptions: a tractable likelihood, a specific ODE/SDE solver, or a particular model family. We introduce FAV, Few-step Generative Models Alignment via Sample-based Variational Inference, a general alignment framework that requires only sample access to the generator and the reference distribution. We cast alignment as sampling from a reward-tilted distribution anchored to a reference distribution. We leverage Stein Variational Gradient Descent as a sample-based variational inference scheme and amortize its particle updates into the generator parameters via fixed-point regression. We evaluate FAV on two domains: robotics manipulation and image generator alignment. On generative policy alignment for robotic manipulation, FAV outperforms prevailing policy extraction baselines across 56 offline and 30 offline-to-online RL tasks. For image generator alignment, FAV fine-tunes diverse few-step backbones, including GAN, drifting model, consistency models, and flow maps, scaling from ImageNet-$256$ to 1024$^2$ text-to-image synthesis. Code is available at https://github.com/Jaewoopudding/FAV.
Keywords
Related papers
Parallel Differentiable Reachability for Learning and Planning with Certified Neural Dynamics and Controllers
Keyi Shen, Glen Chou
2026
Trust Region Q Adjoint Matching
Yonghoon Dong, Kyungmin Lee, Changyeon Kim +2 more
2026
Manipulating Tangible Virtual Object Dynamics to Promote Learning of Precision Force Generation
Alberto Garzás-Villar, Alba Riera-Cardona, Alexis Derumigny +3 more
2026
Efficient On-policy Visual-RL via Stochastic Decoupled Policy Gradient
Haoxiang You, Yilang Liu, Davis Zong +5 more
2026