On the active perception of speech by robots

Harouna Kabré

Year: 2002
Citations: 9

Abstract

We describe an autonomous-agent approach to automatic speech recognition based on linking two models: a virtual environment model (VEM) and a virtual speaker model (VSM). The VEM can generate synthetic signals of different wavelengths and can record real-world data from a camera and a microphone. The VSM is a speech synthesis model whose controllable parameters can be used to synthesize speech signals that vary according to the characteristics of an unknown speaker. The VEM and VSM are instantiated to train artificial neural networks that extract and integrate the auditory and visual information paths for robust automatic speech recognition. Such an instance is called an autonomous speech recognition agent (ASRA) or, equivalently, a speech robot. In this framework, the problem of robust automatic speech recognition amounts to selecting the best ASRA for a given pair of VEM and VSM. The paper describes the simulation environment and presents potential applications of the model to data fusion, to the evaluation of ASRAs, and to emergent properties of auto-adaptive systems.
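The selection step described in the abstract (choosing the best ASRA for a given VEM/VSM pair) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the class names `VirtualSpeaker`, `VirtualEnvironment` and `Agent`, the single `pitch_shift` speaker parameter, the Gaussian noise model, and the fixed-offset recognizer that stands in for the trained neural networks. The sketch covers only the audio path and uses plain accuracy as the selection criterion.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple

Sample = Tuple[float, int]  # (observed feature, true word label)


@dataclass
class VirtualSpeaker:
    """Hypothetical stand-in for a VSM: one controllable parameter."""
    pitch_shift: float

    def utter(self, word: int) -> float:
        # Each word label maps to a nominal feature value, shifted by the speaker.
        return float(word) + self.pitch_shift


@dataclass
class VirtualEnvironment:
    """Hypothetical stand-in for a VEM: adds synthetic Gaussian noise."""
    noise_level: float
    seed: int = 0

    def record(self, speaker: VirtualSpeaker, words: List[int]) -> List[Sample]:
        rng = random.Random(self.seed)  # fixed seed keeps recordings reproducible
        return [(speaker.utter(w) + rng.gauss(0.0, self.noise_level), w)
                for w in words]


@dataclass
class Agent:
    """Hypothetical stand-in for an ASRA (assumption: not a neural network)."""
    name: str
    offset: float  # speaker shift this agent is tuned for

    def recognize(self, feature: float) -> int:
        # Remove the assumed shift, then pick the nearest word label.
        return round(feature - self.offset)


def accuracy(agent: Agent, samples: List[Sample]) -> float:
    """Fraction of samples whose label the agent recovers."""
    hits = sum(1 for feature, word in samples if agent.recognize(feature) == word)
    return hits / len(samples)


def select_best_agent(agents: List[Agent], env: VirtualEnvironment,
                      speaker: VirtualSpeaker, words: List[int]) -> Agent:
    """Return the agent with the highest accuracy for this (VEM, VSM) pair."""
    samples = env.record(speaker, words)
    return max(agents, key=lambda agent: accuracy(agent, samples))
```

For example, calling `select_best_agent` with agents tuned to offsets 0.0 and 0.8, a noise-free environment, and a speaker with `pitch_shift=0.8` returns the agent tuned to 0.8. A fuller treatment would replace the fixed-offset recognizer with trained audio-visual networks, as the abstract describes.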

Keywords

Computer science, Speech recognition, Robot, Microphone, Speech processing, Artificial neural network, Perception, Artificial intelligence
