Home /Breakthroughs /Google DeepMind's Gemini Robotics: Vision-Language-Action Models for Humanoid Robots

🧪 Technology⚡· Impact 7/10

Google DeepMind's Gemini Robotics: Vision-Language-Action Models for Humanoid Robots

Source: deepmind.google · Jun 11, 2026

Summary

Google DeepMind introduces Gemini Robotics, a family of vision-language-action models that enable humanoid robots to perceive, reason, and perform complex multi-step tasks with generalization across different robot embodiments. This represents a significant step toward general-purpose robotics, integrating advanced AI reasoning with physical action.

Related research

All papers →

Recent papers matching 4 keywords from this headline

PERCEPTION

Open access📊 9,777 cites

TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems

Martı́n Abadi, Ashish Agarwal, Paul Barham +17 more

2016

📄 PDF Details →

LEARNING

📊 7,678 cites

Fractional Brownian Motions, Fractional Noises and Applications

Benoît B. Mandelbrot, John W. Van Ness

1968

Details →

OTHER

Open access📊 2,593 cites

Lipid extraction by methyl-tert-butyl ether for high-throughput lipidomics

Vitali Matyash, Gerhard Liebisch, Teymuras V. Kurzchalia +2 more

2008

📄 PDF Details →

OTHER

📊 2,583 cites

Applied Mechanics and Materials

2026

Details →