LOCOMOTION

To Select or not to Select, that is the Question: Distilling Robot Skill Prediction into a Small Ensemble

Haechan Mark Bong, Simon Roy, Euhid Aman, Giovanni Beltrame

Publication year
2026
Access
Open access

Abstract

As robot fleets become more heterogeneous, including humanoids, rovers, quadrupeds, and drones, selecting the right robot for a task becomes a core systems problem. We study robot skill prediction: mapping a natural-language task description to the physical capabilities required to execute it, such as fly, wheels, legs, surface water, under water, and hands. Since no labelled data exists that maps natural-language task descriptions to robots' physical capabilities, we construct a synthetic task-to-skill dataset using LLM-assisted generation and targeted label auditing. Trained on this data, a ~133M-parameter ensemble of two fine-tuned sentence encoders (mpnet + MiniLM) reaches 83.5% task-to-skill matching on a stratified 200-task dataset, outperforming Kimi K2 (1T MoE) at 72.0%, GPT-OSS-120B at 71.5%, and Llama-4-Scout-17B at 69.0%, each evaluated with the same zero-shot prompt. These results suggest that, for fixed robot skill taxonomies, small specialized models trained on synthetic data can outperform much larger general-purpose LLMs for fleet-level task routing.
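
To make the pipeline in the abstract concrete, the following is a minimal sketch of how two per-skill score vectors might be combined and then used for routing. The abstract does not say how the two encoders are ensembled or how "task-to-skill matching" is scored, so the probability averaging, the 0.5 threshold, the exact-set-match metric, the robot names, and all function names here are assumptions made for illustration, not the authors' method. The scores are made-up numbers standing in for the outputs of the two fine-tuned encoders.

```python
from typing import Dict, List, Set

# Skill taxonomy as listed in the abstract.
SKILLS = ["fly", "wheels", "legs", "surface water", "under water", "hands"]


def ensemble_predict(probs_a: Dict[str, float],
                     probs_b: Dict[str, float],
                     threshold: float = 0.5) -> Set[str]:
    """Average per-skill probabilities from two models, then threshold.

    Assumption: simple mean plus a fixed threshold. Both dicts must
    contain an entry for every skill in SKILLS.
    """
    selected = set()
    for skill in SKILLS:
        avg = (probs_a[skill] + probs_b[skill]) / 2.0
        if avg >= threshold:
            selected.add(skill)
    return selected


def exact_match_rate(predicted: List[Set[str]], gold: List[Set[str]]) -> float:
    """Fraction of tasks whose predicted skill set equals the gold set.

    Assumption: one plausible reading of "task-to-skill matching".
    """
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        return 0.0
    hits = sum(1 for p, g in zip(predicted, gold) if p == g)
    return hits / len(gold)


def route(required: Set[str], fleet: Dict[str, Set[str]]) -> List[str]:
    """Return the robots whose capabilities cover every required skill."""
    return sorted(name for name, caps in fleet.items() if required <= caps)


# Hypothetical scores for a task such as "photograph cracks under a bridge".
scores_a = {"fly": 0.9, "wheels": 0.2, "legs": 0.1,
            "surface water": 0.3, "under water": 0.05, "hands": 0.1}
scores_b = {"fly": 0.7, "wheels": 0.4, "legs": 0.1,
            "surface water": 0.1, "under water": 0.05, "hands": 0.2}

# Hypothetical fleet; names and capability sets are illustrative only.
fleet = {
    "drone": {"fly"},
    "rover": {"wheels"},
    "humanoid": {"legs", "hands"},
}

required = ensemble_predict(scores_a, scores_b)
candidates = route(required, fleet)
```

With these illustrative numbers only "fly" clears the threshold after averaging, so the task would be routed to the drone. In this sketch a task whose required skills no single robot covers yields an empty candidate list, which a real system would need to handle separately.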

Keywords

cs.RO
