

Perceive, Represent, Generate: Translating Multimodal Information to Robotic Motion Trajectories

Fábio Vital, Miguel Vasco, Alberto Sardinha, Francisco Melo

Year of publication: 2022
Access: Open access

Abstract

We present Perceive-Represent-Generate (PRG), a novel three-stage framework that maps perceptual information of different modalities (e.g., visual or sound), corresponding to a sequence of instructions, to an appropriate sequence of movements to be executed by a robot. In the first stage, we perceive and pre-process the given inputs, isolating individual commands from the complete instruction provided by a human user. In the second stage, we encode the individual commands into a multimodal latent space, employing a deep generative model. Finally, in the third stage, we convert the multimodal latent values into individual trajectories and combine them into a single dynamic movement primitive, allowing its execution on a robotic platform. We evaluate our pipeline in the context of a novel robotic handwriting task, where the robot receives a word as input through different perceptual modalities (e.g., image, sound) and generates the corresponding motion trajectory to write it, creating coherent and readable handwritten words.
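To make the final step more concrete, the following is a minimal sketch of a one-dimensional discrete dynamic movement primitive (DMP) in the common textbook spring-damper form. It is not the authors' implementation: the abstract does not specify their DMP formulation, so the function names (`fit_dmp`, `rollout`), the gains, the basis-width heuristic, and the omission of the usual (goal - start) scaling of the forcing term are all assumptions made for illustration. The sketch fits forcing-term weights to a demonstrated trajectory and then integrates the system to reproduce it.

```python
import math

ALPHA_Z = 25.0            # spring gain (assumed value)
BETA_Z = ALPHA_Z / 4.0    # gives critical damping
ALPHA_X = 4.0             # decay rate of the phase variable (assumed value)


def _basis(n_basis):
    """Gaussian basis centres (evenly spaced in time) and widths."""
    centres = [math.exp(-ALPHA_X * i / (n_basis - 1)) for i in range(n_basis)]
    widths = [n_basis ** 1.5 / c / ALPHA_X for c in centres]  # common heuristic
    return centres, widths


def fit_dmp(demo, dt, n_basis=20):
    """Fit forcing-term weights to a 1-D demonstration (list, >= 2 samples)."""
    n = len(demo)
    tau, goal = dt * (n - 1), demo[-1]
    # Forward-difference velocity and acceleration, zero-padded at the end.
    vel = [(demo[i + 1] - demo[i]) / dt for i in range(n - 1)] + [0.0]
    acc = [(vel[i + 1] - vel[i]) / dt for i in range(n - 1)] + [0.0]
    phase = [math.exp(-ALPHA_X * i * dt / tau) for i in range(n)]
    # Forcing term that would make the spring-damper follow the demonstration.
    target = [tau ** 2 * acc[i]
              - ALPHA_Z * (BETA_Z * (goal - demo[i]) - tau * vel[i])
              for i in range(n)]
    centres, widths = _basis(n_basis)
    weights = []
    for c, h in zip(centres, widths):
        # Locally weighted regression of target against the phase variable.
        psi = [math.exp(-h * (x - c) ** 2) for x in phase]
        num = sum(x * p * f for x, p, f in zip(phase, psi, target))
        den = sum(x * x * p for x, p in zip(phase, psi))
        weights.append(num / (den + 1e-10))
    return {"weights": weights, "y0": demo[0], "goal": goal, "tau": tau}


def rollout(dmp, dt, n_steps):
    """Integrate the DMP with explicit Euler steps and return the positions."""
    centres, widths = _basis(len(dmp["weights"]))
    x, y, z = 1.0, dmp["y0"], 0.0   # phase, position, scaled velocity
    traj = [y]
    for _ in range(n_steps - 1):
        psi = [math.exp(-h * (x - c) ** 2) for c, h in zip(centres, widths)]
        force = x * sum(p * w for p, w in zip(psi, dmp["weights"])) / (sum(psi) + 1e-10)
        z_dot = (ALPHA_Z * (BETA_Z * (dmp["goal"] - y) - z) + force) / dmp["tau"]
        y_dot = z / dmp["tau"]
        z, y = z + z_dot * dt, y + y_dot * dt
        x += -ALPHA_X * x / dmp["tau"] * dt
        traj.append(y)
    return traj


# Example: reproduce a smooth 0 -> 1 reaching profile sampled at 100 Hz.
dt = 0.01
demo = [10 * s ** 3 - 15 * s ** 4 + 6 * s ** 5 for s in (i / 100 for i in range(101))]
reproduced = rollout(fit_dmp(demo, dt), dt, len(demo))
```

In a handwriting setting, one such system per pen coordinate (for example x and y) could share the same phase variable; how PRG merges per-letter trajectories into a single primitive is not detailed in the abstract, so that step is left out here.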

Keywords

cs.RO, cs.AI, cs.LG

