An open source software system for robot audition HARK and its evaluation
Kazuhiro Nakadai, Hiroshi G. Okuno, Hirofumi Nakajima, Yuji Hasegawa, Hiroshi Tsujino
Year: 2008
Citations: 70
Abstract
Robot audition, the ability of a robot to listen to several sound sources at once with its own ears, is important for improving human-robot interaction. The critical issue in robot audition is real-time processing in noisy environments, with enough flexibility to support various kinds of robots and hardware configurations. This paper presents open-source robot audition software called “HARK”, which includes sound source localization, sound source separation, and automatic speech recognition (ASR). Because separation introduces spectral distortion into the separated sounds, HARK generates a time-frequency map of reliability, called a “missing feature mask”, for the features of the separated sounds. The separated sounds are then recognized by an ASR system based on Missing-Feature Theory (MFT), using the missing feature masks. HARK is implemented on middleware called “FlowDesigner”, which shares intermediate audio data to enable real-time processing. HARK’s performance in recognizing noisy and simultaneous speech is demonstrated on three humanoid robots with different microphone layouts: Honda ASIMO, SIG2, and Robovie.
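To make the idea of a missing feature mask concrete, the following is a minimal sketch, not HARK's implementation. It assumes a per-bin reliability score in [0, 1] is already available, binarizes it with a hypothetical threshold of 0.5, and scores features against a diagonal Gaussian while skipping unreliable bins (the marginalization commonly used in missing-feature approaches). The function names, the threshold, and the toy values are all illustrative assumptions; the abstract does not specify how HARK computes reliability or applies the mask.

```python
import math

def hard_mask(reliability, threshold=0.5):
    """Binarize a time-frequency reliability map: 1 = trusted, 0 = ignored.

    `reliability` is a list of frames, each a list of per-bin scores.
    The 0.5 threshold is an assumption made for this sketch.
    """
    return [[1 if r > threshold else 0 for r in frame] for frame in reliability]

def masked_log_likelihood(features, mask, mean, var):
    """Diagonal-Gaussian log-likelihood summed over reliable features only.

    Bins with mask 0 contribute nothing, so distorted features cannot
    drag the score down.
    """
    total = 0.0
    for frame, frame_mask in zip(features, mask):
        for x, m, mu, v in zip(frame, frame_mask, mean, var):
            if m:
                total += -0.5 * (math.log(2 * math.pi * v) + (x - mu) ** 2 / v)
    return total

# Toy data: 2 frames x 3 frequency bins (values are made up).
reliability = [[0.9, 0.2, 0.7], [0.4, 0.8, 0.95]]
features = [[1.0, 5.0, 0.0], [9.0, 0.0, 1.0]]
mean = [0.0, 0.0, 0.0]
var = [1.0, 1.0, 1.0]

mask = hard_mask(reliability)
masked_score = masked_log_likelihood(features, mask, mean, var)

# For comparison: trusting every bin, including the distorted ones.
all_ones = [[1] * len(frame) for frame in features]
unmasked_score = masked_log_likelihood(features, all_ones, mean, var)
```

In this toy case the two low-reliability bins hold the outlier values 5.0 and 9.0, so the masked score is much higher than the unmasked one; this is the effect the mask is meant to have on distorted separated speech.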