Home /Breakthroughs /Google DeepMind's Gemini Robotics: Vision-Language-Action Models for Humanoid Robots
🧪 Technology⚡· Impact 7/10
Google DeepMind's Gemini Robotics: Vision-Language-Action Models for Humanoid Robots
Source: deepmind.google · Jun 11, 2026
Summary
Google DeepMind introduces Gemini Robotics, a family of vision-language-action models that enable humanoid robots to perceive, reason, and perform complex multi-step tasks with generalization across different robot embodiments. This represents a significant step toward general-purpose robotics, integrating advanced AI reasoning with physical action.
Related research
All papers →Recent papers matching 4 keywords from this headline
PERCEPTION
Open access📊 9,777 cites
TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems
Martı́n Abadi, Ashish Agarwal, Paul Barham +17 more
2016
LEARNING
📊 7,678 cites
Fractional Brownian Motions, Fractional Noises and Applications
Benoît B. Mandelbrot, John W. Van Ness
1968
OTHER
Open access📊 2,593 cites
Lipid extraction by methyl-tert-butyl ether for high-throughput lipidomics
Vitali Matyash, Gerhard Liebisch, Teymuras V. Kurzchalia +2 more
2008
OTHER
📊 2,583 cites
Applied Mechanics and Materials
2026