On the active perception of speech by robots
Harouna Kabré
- Year: 2002
- Citations: 9
Abstract
We describe an autonomous-agent approach to automatic speech recognition based on linking two models: a virtual environment model (VEM) and a virtual speaker model (VSM). The VEM is a system that can generate synthetic signals of different wavelengths and record real-world data from a camera and a microphone. The VSM is a speech synthesis model whose controllable parameters can be used to synthesize speech signals that vary according to the characteristics of an unknown speaker. The VEM and VSM are instantiated to train artificial neural networks that extract and integrate the auditory and visual information paths for robust automatic speech recognition. Such an instance is called an autonomous speech recognition agent (ASRA) or, equivalently, a speech robot. In this new framework, the problem of robust automatic speech recognition amounts to selecting the best ASRA for a given pair of VEM and VSM. The paper describes the simulation environment and presents potential applications of this model to data fusion, to the evaluation of ASRAs, and to the emergent properties of auto-adaptive systems.
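The selection step stated in the abstract (choosing the best ASRA for a given VEM and VSM pair) can be illustrated with a toy sketch. This is not the paper's implementation: the paper trains neural networks on audio-visual data, whereas the agents here are simple threshold classifiers on one synthetic feature. All names (`synthesize`, `make_threshold_agent`, `select_best_agent`) and the parameters `speaker_shift` and `noise_level` are hypothetical stand-ins for the VSM and VEM settings.

```python
import random
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[List[float], int]      # (feature vector, word label)
Agent = Callable[[List[float]], int]  # maps features to a predicted label


def synthesize(n: int, speaker_shift: float, noise_level: float,
               seed: int = 0) -> List[Sample]:
    """Toy stand-in for a (VEM, VSM) pair (assumption, not the paper's models).

    speaker_shift mimics a speaker-dependent VSM parameter;
    noise_level mimics environmental noise from the VEM.
    """
    rng = random.Random(seed)
    data: List[Sample] = []
    for _ in range(n):
        label = rng.randint(0, 1)
        centre = (1.0 if label else -1.0) + speaker_shift
        data.append(([centre + rng.gauss(0.0, noise_level)], label))
    return data


def make_threshold_agent(threshold: float) -> Agent:
    """Hypothetical 'agent': predicts label 1 when the feature exceeds threshold."""
    def agent(features: List[float]) -> int:
        return 1 if features[0] > threshold else 0
    return agent


def accuracy(agent: Agent, data: Sequence[Sample]) -> float:
    """Fraction of samples the agent labels correctly."""
    return sum(agent(x) == y for x, y in data) / len(data)


def select_best_agent(agents: Sequence[Agent],
                      data: Sequence[Sample]) -> Tuple[int, float]:
    """Return (index, accuracy) of the agent that scores highest on data."""
    scores = [accuracy(a, data) for a in agents]
    best = max(range(len(agents)), key=scores.__getitem__)
    return best, scores[best]


if __name__ == "__main__":
    data = synthesize(500, speaker_shift=2.0, noise_level=0.3, seed=1)
    agents = [make_threshold_agent(t) for t in (-2.0, 0.0, 2.0)]
    best, score = select_best_agent(agents, data)
    print(f"best agent index: {best}, accuracy: {score:.3f}")
```

With the speaker shifted by 2.0, the two word classes centre on 1.0 and 3.0, so the agent whose threshold lies between them should be selected. The point of the sketch is only that the "best" agent depends on the environment and speaker settings used to generate the evaluation data.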