PERCEPTION

Learning and Verification of Names with Multimodal User ID in Dialog

Hartwig Holzapfel, Alex Waibel

Publication year
2008
Citations
3

Abstract

Acquiring new knowledge is a key functionality for humanoid robots. A robot that provides personalized services needs to detect, recognize, and memorize information about specific persons. Recent work already shows promising results in speech recognition, voice identification, and face identification, which enable a system to reliably detect and recognize persons, as well as in approaches that interactively get to know new persons in dialog by acquiring their names and ID information. One problem in this area is verification, namely detecting which person is known and which is unknown; a second problem is the learning phase, namely learning the name of a person and storing it in a database with associated face and voice classifier information. This paper presents work to interactively acquire ID information, combining both of the above problems into one learning dialog. In dialog we combine multimodal input, including spoken name recognition, name pronunciation (phoneme recognition), name spelling (grapheme representation), face identification, and voice identification, and seek to build dialogs optimized to verify or learn a person's name and ID. For designing and training optimized dialogs we use a reinforcement learning approach and propose a multimodal simulation that models the user's actions and the multimodal ID recognition components, including stochastic error models.

I. INTRODUCTION

In this paper we present work on learning names and person ID information in a multimodal dialog system for a humanoid robot. Among the dialogs that can be conducted with the robot are dialogs to identify persons and, in particular, to get to know new persons. We have conducted experiments with a receptionist scenario, where one task of the robot receptionist was to identify the visiting person or to learn the person's name if unknown.

In the following we focus on this task, namely isolated identification dialogs within the receptionist scenario. These dialogs fulfill two purposes: if the person is known, confirm the person's name; if the person is unknown, classify the person as unknown and conduct a learning dialog to obtain the person's name.

The presented experiments make use of standard perceptual components available on a humanoid robot: visual perception with a stereo camera and acoustic perception with distant and close-talk microphones. Visual perception provides face detection and identification. Acoustic perception provides voice identification and speech recognition, including name recognition, spelling, and phonetic understanding. These components provide recognition hypotheses, which are interpreted by the dialog manager.

The challenge of this task is to define a dialog strategy, including when to confirm ID information and when to ask for name pronunciation or spelling. With the goal of optimizing dialogs with regard to success, length, and subjective measures, we have implemented a reinforcement learning approach that combines verification and learning into one dialog, integrating the multiple input modalities presented above. To achieve this goal, we first implemented a rule-based dialog strategy and later a reinforcement learning strategy, which was trained in a multimodal user simulation.

In the following we present the setup for multimodal integration in dialog, the definition of the handcrafted strategy, and the learning of dialog strategies in the multimodal user simulation. Both dialog strategies are evaluated within the simulation and compared against each other. First results from a real user experiment are reported.
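As a rough illustration of training a dialog strategy in a user simulation with stochastic error models, the following toy sketch may help. It is not the authors' system: the state representation, the action set, the error rates, the rewards, and the choice of tabular Q-learning are all assumptions made for this example, since the text above names reinforcement learning but does not give those details.

```python
import random
from collections import defaultdict

# Hypothetical action set and error rates (assumptions, not from the paper).
ACTIONS = ("ask_name", "ask_spelling", "confirm", "finish")
ERROR_RATE = {"ask_name": 0.4, "ask_spelling": 0.1}


def step(state, hyp_correct, action, rng):
    """Simulate one turn; returns (next_state, hyp_correct, reward, done).

    state is (has_hypothesis, confirmed). hyp_correct is hidden from the
    agent and only used by the simulated user and the final reward.
    """
    has_hyp, _confirmed = state
    if action in ERROR_RATE:
        # Noisy recognizer: the new hypothesis is wrong with the assumed rate.
        hyp_correct = rng.random() >= ERROR_RATE[action]
        return (True, False), hyp_correct, -1.0, False
    if action == "confirm":
        if has_hyp and hyp_correct:
            return (True, True), True, -1.0, False
        # The simulated user rejects a wrong or missing hypothesis.
        return (False, False), False, -1.0, False
    # "finish": store the name; success only if the hypothesis is correct.
    reward = 10.0 if has_hyp and hyp_correct else -10.0
    return state, hyp_correct, reward, True


def train(episodes=20000, alpha=0.1, gamma=0.95, epsilon=0.2, seed=0,
          max_turns=8):
    """Tabular Q-learning over simulated dialogs (epsilon-greedy)."""
    rng = random.Random(seed)
    q = defaultdict(float)
    for _ in range(episodes):
        state, hyp_correct = (False, False), False
        for _ in range(max_turns):
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            nxt, hyp_correct, reward, done = step(state, hyp_correct,
                                                  action, rng)
            best_next = 0.0 if done else max(q[(nxt, a)] for a in ACTIONS)
            target = reward + gamma * best_next
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = nxt
            if done:
                break
    return q


def greedy_policy(q):
    """Read off the greedy action for each reachable state."""
    states = [(False, False), (True, False), (True, True)]
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in states}
```

Calling `greedy_policy(train())` yields a mapping from each toy state to an action. The per-turn cost of -1 stands in for dialog length, and the final reward stands in for task success; subjective measures are not modeled in this sketch.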

Keywords

Computer science, Dialog box, Artificial intelligence, Speech recognition, Natural language processing, Identification (biology), Classifier (UML), Dialog system, Humanoid robot, Pronunciation
