HRI

Computational auditory scene analysis and its application to robot audition

Hiroshi G. Okuno, Tetsuya Ogata, Kazunori Komatani, Kazuhiro Nakadai

Year
2004
Citations
31

Abstract

The ability of a robot to hear sounds, in particular a mixture of sounds, with its own microphones (robot audition) is important for improving human-robot interaction. This paper presents open-source robot audition software called “HARK” (HRI-JP Audition for Robots with Kyoto University), which provides the primitive functions of computational auditory scene analysis: sound source localization, sound source separation, and recognition of the separated sounds. Because separation introduces spectral distortion into the separated sounds, HARK generates a time-spectral map of reliability, called a “missing feature mask”, for the features of each separated sound. The separated sounds are then recognized by automatic speech recognition (ASR) based on Missing-Feature Theory (MFT), using these missing feature masks. HARK is implemented on the “FlowDesigner” middleware so that intermediate audio data can be shared, which enables near real-time processing.

Index Terms: robot audition, computational auditory scene analysis, missing feature theory, simultaneous speakers
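The idea of a missing feature mask can be illustrated with a minimal sketch. It is not HARK's implementation: the local-SNR threshold rule, the binary (rather than soft) mask, the diagonal-Gaussian acoustic model, and the function names `missing_feature_mask` and `masked_log_likelihood` are all assumptions made for illustration. The sketch marks each time-frequency feature as reliable or unreliable, then scores a frame using only the reliable features, which is one common way MFT-based recognition is described.

```python
import math


def missing_feature_mask(separated, interference, threshold_db=0.0):
    """Return a binary time-frequency mask: 1 = reliable, 0 = unreliable.

    separated, interference: lists of frames, each a list of per-band
    energies on a linear scale. The local-SNR threshold used here is an
    illustrative assumption, not HARK's documented criterion.
    """
    mask = []
    for sep_frame, int_frame in zip(separated, interference):
        row = []
        for s, n in zip(sep_frame, int_frame):
            # Guard against log(0) and division by zero with a small floor.
            snr_db = 10.0 * math.log10(max(s, 1e-12) / max(n, 1e-12))
            row.append(1 if snr_db > threshold_db else 0)
        mask.append(row)
    return mask


def masked_log_likelihood(frame, mask_row, mean, var):
    """Diagonal-Gaussian log-likelihood summed over reliable features only.

    Features with mask value 0 are skipped, so a distorted feature cannot
    dominate the score (hypothetical acoustic model, for illustration).
    """
    total = 0.0
    for x, m, mu, v in zip(frame, mask_row, mean, var):
        if m:
            total += -0.5 * (math.log(2.0 * math.pi * v) + (x - mu) ** 2 / v)
    return total


if __name__ == "__main__":
    separated = [[4.0, 0.5, 2.0], [1.0, 3.0, 0.1]]
    interference = [[1.0, 1.0, 1.0], [2.0, 1.0, 1.0]]
    mask = missing_feature_mask(separated, interference)
    print(mask)
    print(masked_log_likelihood([0.0, 100.0], mask[0][:2], [0.0, 0.0], [1.0, 1.0]))
```

In this toy input, the second feature of the scored frame is badly distorted (value 100.0) but is masked out, so it has no effect on the score.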

Keywords

Computer science, Robot, Speech recognition, Computational auditory scene analysis, Interface (matter), Filter (signal processing), Transformation (genetics), Auditory scene analysis, Symbol (formal), Computer vision
