HRI

Smooth Turn-taking by a Robot Using an Online Continuous Model to Generate Turn-taking Cues

Divesh Lala, Koji Inoue, Tatsuya Kawahara

Year
2019
Citations
42

Abstract

Turn-taking in human-robot interaction is a crucial part of spoken dialogue systems, but current models do not achieve the human-like turn-taking speed seen in natural conversation. In this work, we propose combining two independent prediction models. A continuous model predicts the upcoming end of the turn in order to generate gaze aversion and fillers as turn-taking cues. This prediction is made while the user is speaking, so the robot can take the turn with little silence between turns, or even with overlap. Once a speech recognition result is received later, a second model uses the lexical information to decide whether and when the turn should actually be taken. We constructed the continuous model using the speaker’s prosodic features as inputs and evaluated its online performance. We then conducted a subjective experiment in which we implemented our model in an android robot and asked participants to compare it to a model without turn-taking cues, which produces a response when a speech recognition result is received. We found that using both gaze aversion and a filler was preferred when the continuous model correctly predicted the upcoming end of turn, while using only gaze aversion was better if the prediction was wrong.
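
The two-stage idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `TurnTaker`, the 0.7 threshold, the action labels, and the toy lexical rule are all assumptions made for illustration, and the end-of-turn probabilities stand in for the output of a prosody-based continuous model.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class TurnTaker:
    # Hypothetical threshold on the continuous model's end-of-turn probability.
    cue_threshold: float = 0.7
    # Whether a filler accompanies gaze aversion (assumed configuration switch).
    use_filler: bool = True
    actions: List[str] = field(default_factory=list)
    cued: bool = False

    def on_prosody_frame(self, p_end_of_turn: float) -> None:
        """Stage 1: called repeatedly while the user is still speaking."""
        if not self.cued and p_end_of_turn >= self.cue_threshold:
            self.actions.append("gaze_aversion")
            if self.use_filler:
                self.actions.append("filler")
            self.cued = True  # emit cues at most once per user turn

    def on_asr_result(self, text: str, is_turn_end: Callable[[str], bool]) -> None:
        """Stage 2: called later, once the speech recognizer returns text."""
        if is_turn_end(text):
            self.actions.append("take_turn")
        else:
            self.actions.append("keep_listening")
        self.cued = False  # reset for the next user turn


def toy_lexical_rule(text: str) -> bool:
    """Stand-in for the lexical model: a trailing conjunction suggests the user will continue."""
    words = text.lower().split()
    return bool(words) and words[-1] not in {"and", "but", "so", "because"}


robot = TurnTaker()
for p in [0.1, 0.4, 0.8, 0.9]:  # made-up per-frame probabilities
    robot.on_prosody_frame(p)
robot.on_asr_result("i went to kyoto yesterday", toy_lexical_rule)
print(robot.actions)  # ['gaze_aversion', 'filler', 'take_turn']
```

Setting `use_filler=False` in this sketch corresponds loosely to the gaze-only condition, which the abstract reports as the better choice when the continuous prediction turns out to be wrong.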

Keywords

Turn-taking, Gaze, Computer science, Conversation, Robot, Human–robot interaction, Artificial intelligence, Speech recognition, Turn (biochemistry), Human–computer interaction
