LEARNING

A unified approach to speech production and recognition based on articulatory motor representations

Jonas Hörnstein, José Santos-Victor

Publication year
2007
Citations
8

Abstract

We present a unified approach to speech production and recognition based on articulatory motor representations. The approach is inspired by the motor theory and the discovery of mirror neurons, and uses motor representations for both reproduction and recognition of speech. A model of the vocal tract is used to create sound, and the created sound is then mapped back to the motor representation using a neural network. To learn the map, we mimic the behavior of a child that uses a combination of babbling and interaction with its caregiver to learn how to speak. Several different phases of babbling and interaction are identified and described; these help to overcome the inversion problem. The approach has been implemented on a humanoid robot, which has successfully learned to pronounce Swedish and Portuguese vowels. We have also studied how the different phases of babbling and interaction affect the error of the map and the recognition rate achieved when the robot is presented with vowels from different subjects. Finally, we compare the recognition rates obtained using motor space with those obtained by directly using the acoustic parameters.

Keywords

Babbling, Computer science, Speech recognition, Speech production, Vocal tract, Neurocomputational speech processing, Representation (politics), Articulation (sociology), Artificial intelligence, Psychology
