On the active perception of speech by robots

Harouna Kabré

Year published: 2002
Citations: 9

Abstract

We describe an autonomous-agent approach to automatic speech recognition based on linking two models: a virtual environment model (VEM) and a virtual speaker model (VSM). The VEM is a system that can generate synthetic signals of different wavelengths and can record real-world data from a camera and a microphone. The VSM is a speech synthesis model with controllable parameters that can be used to synthesize speech signals that vary according to the characteristics of an unknown speaker. The VEM and VSM are instantiated to train artificial neural networks that extract and integrate the auditory and visual information paths for the purpose of robust automatic speech recognition. Such an instance is called an autonomous speech recognition agent (ASRA) or, equivalently, a speech robot. In this new framework, the problem of robust automatic speech recognition amounts to selecting the best ASRA for a given pair of VEM and VSM. The paper describes the simulation environment and presents potential applications of this new model to data fusion, to the evaluation of ASRAs, and to the emergent properties of auto-adaptive systems.

Keywords

Computer science; Speech recognition; Robot; Microphone; Speech processing; Artificial neural network; Perception; Artificial intelligence
