HRI

Speaker localization among multi-faces in noisy environment by audio-visual integration

Hyun-Don Kim, JongSuk Choi, Munsang Kim

Publication year
2006
Citations
14

Abstract

In this paper, we developed a reliable sound localization system that uses three microphones and includes a voice activity detection (VAD) component, together with a face tracking system that uses a vision camera. We also proposed a way to integrate these two systems for human-robot interaction, both to compensate for errors in localizing a speaker and to effectively reject unwanted speech or noise arriving from undesired directions. To verify the system's performance, we installed the proposed audition and vision systems on a prototype robot called IROBAA (Intelligent ROBot for Active Audition) and showed how to integrate an audio-visual system.
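The integration idea in the abstract can be illustrated with a minimal sketch. This is not the authors' method; the abstract gives no algorithmic detail. It only assumes that the sound localizer yields one azimuth estimate and the face tracker yields one azimuth per detected face, and it picks the face nearest the sound direction, rejecting the sound if no face is close enough. The function names, the degree-based angle convention, and the 15-degree tolerance are all hypothetical.

```python
from typing import Optional, Sequence


def angular_difference(a_deg: float, b_deg: float) -> float:
    """Smallest absolute difference between two bearings, in degrees."""
    return abs((a_deg - b_deg + 180.0) % 360.0 - 180.0)


def select_speaker(
    sound_azimuth_deg: float,
    face_azimuths_deg: Sequence[float],
    tolerance_deg: float = 15.0,  # hypothetical threshold, not from the paper
) -> Optional[int]:
    """Return the index of the face closest to the sound direction.

    Returns None when no face lies within the tolerance, which stands in
    for rejecting speech or noise that arrives from an undesired direction.
    """
    best_index: Optional[int] = None
    best_diff = tolerance_deg
    for i, face_azimuth in enumerate(face_azimuths_deg):
        diff = angular_difference(sound_azimuth_deg, face_azimuth)
        if diff <= best_diff:
            best_index, best_diff = i, diff
    return best_index
```

For example, with faces at -30, 10, and 40 degrees, a sound estimated at 12 degrees is attributed to the second face, while a sound estimated at 90 degrees matches no face and is rejected. Once a face is matched, its azimuth could replace the noisier audio estimate, which is one plausible reading of "compensating" localization errors.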

Keywords

Computer science, Computer vision, Artificial intelligence, Robot, Component (thermodynamics), Face (sociological concept), Noise (video), Acoustic source localization, Audio visual, Speech recognition
