A unified approach to speech production and recognition based on articulatory motor representations

Jonas Hörnstein, José Santos-Victor

Year
2007
Citations
8

Abstract

We present a unified approach for speech production and recognition based on articulatory motor representations. The approach is inspired by the motor theory and the discovery of mirror neurons, and uses motor representations for both reproduction and recognition of speech. A model of the vocal tract is used to create sound, and the created sound is then mapped back to the motor representation using a neural network. To learn the map we mimic the behavior of a child that uses a combination of babbling and interaction with its caregiver to learn how to speak. Several different phases of babbling and interaction are identified and described. These help to overcome the inversion problem. The approach has been implemented on a humanoid robot, which has successfully learned to pronounce Swedish and Portuguese vowels. We have also studied how the different phases of babbling and interaction affect the error of the map and the achieved recognition rate when presented with vowels from different subjects. Finally, we compare the recognition rates obtained using motor space with recognition rates obtained by directly using the acoustic parameters.
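
The babbling idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_vocal_tract` is a hypothetical two-parameter stand-in for a vocal tract model, the two "formant-like" outputs are invented for illustration, and a linear least-squares fit is swapped in for the neural network to keep the example short. All function and variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_vocal_tract(motor):
    """Hypothetical forward model: motor parameters -> acoustic features.

    Assumption: two motor parameters in [0, 1] (called jaw and tongue here)
    produce two formant-like values. A real vocal tract model is far richer.
    """
    motor = np.atleast_2d(motor)
    jaw, tongue = motor[:, 0], motor[:, 1]
    f1 = 300.0 + 500.0 * jaw + 50.0 * jaw * tongue
    f2 = 900.0 + 1400.0 * tongue - 100.0 * jaw ** 2
    return np.column_stack([f1, f2])

def babble(n):
    """Babbling: try random motor commands and record the resulting sound."""
    motor = rng.uniform(0.0, 1.0, size=(n, 2))
    return motor, toy_vocal_tract(motor)

def fit_inverse_map(sound, motor):
    """Fit a sound -> motor map (linear least squares, NOT a neural network)."""
    features = np.column_stack([sound, np.ones(len(sound))])  # add bias term
    weights, *_ = np.linalg.lstsq(features, motor, rcond=None)
    return weights

def sound_to_motor(weights, sound):
    """Map heard sounds back to estimated motor parameters."""
    sound = np.atleast_2d(sound)
    features = np.column_stack([sound, np.ones(len(sound))])
    return features @ weights

# Learn the map from babbling, then evaluate it on fresh babbled sounds.
motor_train, sound_train = babble(500)
weights = fit_inverse_map(sound_train, motor_train)

motor_test, sound_test = babble(100)
motor_pred = sound_to_motor(weights, sound_test)

# Mean Euclidean error of the map in motor space, against a baseline that
# always guesses the average babbled motor command.
map_error = float(np.mean(np.linalg.norm(motor_pred - motor_test, axis=1)))
baseline_error = float(
    np.mean(np.linalg.norm(motor_train.mean(axis=0) - motor_test, axis=1))
)
print(f"map error: {map_error:.3f}, baseline error: {baseline_error:.3f}")
```

In this toy setting the forward model is one-to-one, so the inversion problem mentioned in the abstract does not arise. With a realistic vocal tract, several motor configurations can produce similar sounds, which is where the interaction phases described by the authors would come in.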

Keywords

Babbling, Computer science, Speech recognition, Speech production, Vocal tract, Neurocomputational speech processing, Representation (politics), Articulation (sociology), Artificial intelligence, Psychology
