Enhancing Single Image to 3D Generation using Gaussian Splatting and Hybrid Diffusion Priors
Hritam Basak, Hamid Tabatabaee, Shreekant Gayaka, Mingfeng Li, Xin Yang, Cheng-Hao Kuo, Arnie Sen, Min Sun, Zhaozheng Yin
- Year: 2025
- Citations: 1
Abstract
Generating a 3D object from a single unposed RGB image is essential for robotic perception: reconstructing complete geometry and texture underpins precise manipulation, grasping, and scene understanding, which in turn are key to autonomous navigation and dexterous interaction. Recent image-to-3D methods combine Gaussian Splatting with pre-trained 2D or 3D diffusion models, but the two kinds of prior have complementary weaknesses: 2D models generate high-fidelity textures yet lack geometric consistency, while 3D models ensure structural coherence but produce overly smooth textures. To address this, we introduce a two-stage frequency-based distillation loss integrated with Gaussian Splatting. It draws geometric priors from the low-frequency spectrum of a 3D diffusion model for structural consistency, and high-frequency details from a 2D diffusion model for sharper textures. Our approach achieves state-of-the-art 3D reconstruction quality, significantly improving robotic perception pipelines. We also demonstrate that the method adapts easily to highly accurate object pose estimation and tracking, which is critical for precise robotic grasping, manipulation, and scene understanding. Additional results can be found in the supplementary file.
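The frequency split described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how the spectrum is partitioned or how the diffusion priors enter the loss, so the FFT-based radial low-pass mask, the `cutoff` value, the weights, and all function names below are assumptions made for illustration. The arrays `target_3d` and `target_2d` are hypothetical stand-ins for guidance images derived from a 3D and a 2D diffusion model respectively.

```python
import numpy as np


def radial_lowpass_mask(height, width, cutoff):
    """True where the normalized spatial frequency radius is <= cutoff (assumed split rule)."""
    fy = np.fft.fftfreq(height)[:, None]  # cycles per pixel, in [-0.5, 0.5)
    fx = np.fft.fftfreq(width)[None, :]
    return np.sqrt(fx ** 2 + fy ** 2) <= cutoff


def split_frequencies(image, cutoff=0.1):
    """Split a 2D grayscale image into low- and high-frequency parts that sum to the image."""
    spectrum = np.fft.fft2(image)
    mask = radial_lowpass_mask(image.shape[0], image.shape[1], cutoff)
    low = np.fft.ifft2(spectrum * mask).real
    high = image - low
    return low, high


def hybrid_frequency_loss(render, target_3d, target_2d,
                          cutoff=0.1, w_low=1.0, w_high=1.0):
    """Match low frequencies to the 3D-prior target and high frequencies to the 2D-prior target.

    Hypothetical illustration only: plain MSE on frequency bands, not a distillation loss.
    """
    render_low, render_high = split_frequencies(render, cutoff)
    low_3d, _ = split_frequencies(target_3d, cutoff)
    _, high_2d = split_frequencies(target_2d, cutoff)
    loss_low = np.mean((render_low - low_3d) ** 2)     # structure term
    loss_high = np.mean((render_high - high_2d) ** 2)  # texture term
    return w_low * loss_low + w_high * loss_high


rng = np.random.default_rng(0)
render = rng.random((32, 32))
target_3d = rng.random((32, 32))
target_2d = rng.random((32, 32))
loss = hybrid_frequency_loss(render, target_3d, target_2d)
```

One way to mimic a two-stage schedule under these assumptions would be to call the function with `w_high=0.0` in a first stage and enable the high-frequency term afterwards; whether the paper schedules its stages this way is not stated in the abstract.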