Is Mobile ALOHA (Stanford) truly autonomous?
Mobile ALOHA is a research platform developed at Stanford University (IRIS/REAL Lab) by Zipeng Fu, Tony Z. Zhao, and Chelsea Finn, funded by the Boston Dynamics AI Institute and ONR. It augments the stationary ALOHA bimanual teleoperation system with an AgileX Tracer mobile base and a whole-body teleoperation interface; the full build costs approximately $32,000. The system is designed primarily for collecting data through human teleoperation. Imitation learning (Action Chunking with Transformers, ACT), co-trained on existing static ALOHA datasets, then enables autonomous task execution, with co-training reported to raise success rates by up to 90% on tasks such as cooking shrimp, using elevators, and cabinet manipulation. All of these results were demonstrated in a controlled lab setting. The platform is fully open-source and is a research prototype, not a commercial product.
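To make the co-training idea concrete, the following is a minimal sketch of how a training batch might mix a small set of mobile demonstrations with a larger pool of static ALOHA demonstrations. The function name `sample_cotraining_batch`, the 50/50 mixing ratio `p_mobile`, and the dataset sizes are all assumptions chosen for illustration; none of them is taken from the text above or from the project's code.

```python
import random


def sample_cotraining_batch(mobile_episodes, static_episodes, batch_size,
                            p_mobile=0.5, rng=None):
    """Draw a mixed batch of (source, episode) pairs.

    p_mobile is the probability that any one batch item comes from the
    mobile dataset. The default of 0.5 is an illustrative assumption.
    """
    rng = rng or random.Random()
    batch = []
    for _ in range(batch_size):
        if rng.random() < p_mobile:
            batch.append(("mobile", rng.choice(mobile_episodes)))
        else:
            batch.append(("static", rng.choice(static_episodes)))
    return batch


# Hypothetical dataset sizes: a few task-specific mobile demos and a
# larger pool of static bimanual demos (placeholder names, not real files).
mobile = [f"mobile_demo_{i}" for i in range(50)]
static = [f"static_demo_{i}" for i in range(800)]

batch = sample_cotraining_batch(mobile, static, batch_size=8,
                                rng=random.Random(0))
for source, episode in batch:
    print(source, episode)
```

Sampling by source rather than by episode keeps the small mobile dataset from being swamped by the larger static pool, which is one plausible reason to fix a mixing ratio instead of simply concatenating the two datasets.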
Mobile ALOHA operates in two distinct modes: (1) teleoperation for data collection, where a human physically performs the task through the leader-follower interface, and (2) autonomous execution of trained tasks, where a policy learned by imitation (ACT with co-training) drives the robot with no human in control. The autonomy verdict applies to mode (2), which is the system's stated research goal and is demonstrated in the official project results. The research paper, the PMLR proceedings, and the official project page all confirm autonomous task completion (cooking, elevator use, cabinet manipulation) after training on about 50 demonstrations per task. The teleoperation disclaimers on viral videos refer to the data-collection mode, not to autonomous policy execution; the project website explicitly distinguishes the two.
Confidence is moderate rather than high for three reasons: autonomous results are demonstrated only under controlled lab conditions on specific trained tasks; generalization to novel environments or tasks is not established; and the system is a research prototype, not a deployed product. The human's role in teleoperation is data collection (setup and training), not performing the task during autonomous operation, so it does not disqualify the Autonomous classification under the definitions provided.
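The autonomous mode can be pictured as a loop in which a learned policy, not a human, supplies the actions. The sketch below illustrates the general action-chunking idea: the policy is queried once, returns a short sequence of future actions, and the robot executes that sequence before querying again. Everything here is hypothetical: `rollout_with_chunks`, `dummy_policy`, the chunk size, and the callback signatures are placeholders for illustration and are not the project's API.

```python
def rollout_with_chunks(policy, get_observation, apply_action,
                        total_steps, chunk_size):
    """Run a policy that predicts `chunk_size` actions per query.

    Returns the number of policy queries needed to execute
    `total_steps` actions.
    """
    executed = 0
    queries = 0
    while executed < total_steps:
        obs = get_observation()
        chunk = policy(obs, chunk_size)
        if not chunk:
            raise ValueError("policy returned an empty action chunk")
        queries += 1
        # Execute the chunk open-loop, truncating it at the step budget.
        for action in chunk[: total_steps - executed]:
            apply_action(action)
            executed += 1
    return queries


# Stand-ins for a real robot: the "observation" is just the number of
# actions executed so far, and applying an action appends it to a log.
log = []


def dummy_policy(obs, k):
    # Placeholder for a trained network: emits k one-dimensional actions.
    return [[obs + i] for i in range(k)]


def get_obs():
    return len(log)


queries = rollout_with_chunks(dummy_policy, get_obs, log.append,
                              total_steps=10, chunk_size=4)
print(queries, len(log))
```

In teleoperation mode the same `apply_action` calls would be fed by the human operator's leader arms instead of by `policy`, which is the distinction the autonomy verdict turns on.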
Claims vs. reality
Mobile ALOHA can autonomously complete complex mobile manipulation tasks such as cooking, cabinet use, and elevator operation after 50 demonstrations, with up to a 90% improvement in success rate. (Source: mobile-aloha.github.io)
The widely shared housekeeping video explicitly states that the robot is teleoperated, not autonomous. A separate disclaimer on another video also confirms teleoperation. (Source: youtube.com)
Costs range from $27,000 to $35,000 depending on configuration and sourcing; one video reviewer calculated about $27,000 from component prices. (Source: youtube.com)
Mobile ALOHA is a Stanford University project (Fu, Zhao, Finn); the Boston Dynamics AI Institute is a funder. (Source: mobile-aloha.github.io)
Some third-party sources (YouTube channels, news articles) attribute Mobile ALOHA to Google DeepMind or describe it as a Google DeepMind and Stanford collaboration. (Source: youtube.com)
The mobile base is an AgileX Tracer differential-drive base (implied by the paper citing the Tracer specifically). (Source: mobile-aloha.github.io)
- 🏢Mobile ALOHA
- 🛒ALOHA Robot: Stanford's Open-Source Bimanual System — Setup, Cos
- 🛒Mobile Aloha: A Breakthrough in Humanoid Robotics Unleashes The
- 🛒[PDF] mobile-aloha.pdf
- 🛒Stanford’s mobile ALOHA robot learns from humans to cook, clean,
- 🛒Mobile ALOHA — Stanford Robotics Center
- 📰Stanford Mobile ALOHA 2 release and the open-hardware push | Cal
- 📰Mobile ALOHA: Learning Bimanual Mobile Manipulation using Low-Co
- 📰Stanford's Mobile ALOHA Robots Now Walk… | gentic.news
- 📰Mobile ALOHA Cost & Setup: Full BOM + Alternatives
- 📰Mobile ALOHA: Open Source Housekeeper Robot - Robo9
- 📰Training Data for Stanford SAIL, ALOHA & Octo | Claru
- 2024 · Mobile ALOHA: Learning Bimanual Mobile Manipulation with L
- 2023 · Learning Fine-Grained Bimanual Manipulation with Low-Cost
- 2025 · Whole-Body Teleoperation for Mobile Manipulation at Zero A
- 2025 · ALPHA- α and Bi-ACT Are All You Need: Importance of Positi
- 2024 · ALOHA 2: An Enhanced Low-Cost Hardware for Bimanual Teleop
- Haptic-enabled shared-control telemanipulation [grasping]
- Deep Reinforcement Learning for Vision-Based Robotic Manipulation [grasping]
- Learning from Demonstration for Robot Manipulation [grasping]