
Integrating Large Language Models into Robotic Autonomy: A Review of Motion, Voice, and Training Pipelines

Yutong Liu, Qingquan Sun, Dhruvi Rajeshkumar Kapadia

Publication year: 2025
Citations: 10
Access: Open access

Abstract

This survey provides a comprehensive review of the integration of large language models (LLMs) into autonomous robotic systems, organized around four key pillars: locomotion, navigation, manipulation, and voice-based interaction. We examine how LLMs enhance robotic autonomy by translating high-level natural language commands into low-level control signals, supporting semantic planning and enabling adaptive execution. Systems like SayTap improve gait stability through LLM-generated contact patterns, while TrustNavGPT achieves a 5.7% word error rate (WER) under noisy voice-guided conditions by modeling user uncertainty. Frameworks such as MapGPT, LLM-Planner, and 3D-LOTUS++ integrate multi-modal data—including vision, speech, and proprioception—for robust planning and real-time recovery. We also highlight the use of physics-informed neural networks (PINNs) to model object deformation and support precision in contact-rich manipulation tasks. To bridge the gap between simulation and real-world deployment, we synthesize best practices from benchmark datasets (e.g., RH20T, Open X-Embodiment) and training pipelines designed for one-shot imitation learning and cross-embodiment generalization. Additionally, we analyze deployment trade-offs across cloud, edge, and hybrid architectures, emphasizing latency, scalability, and privacy. The survey concludes with a multi-dimensional taxonomy and cross-domain synthesis, offering design insights and future directions for building intelligent, human-aligned robotic systems powered by LLMs.

Keywords

Computer science; Human–computer interaction; Scalability; Software deployment; Artificial intelligence; Robot; Software engineering
