Improving voice detection in real life scenarios: differentiating television and human speech at older adults’ houses

David Figueroa, Shuichi Nishio, Ryuji Yamazaki, Hiroshi Ishiguro

Year: 2022
Citations: 2
Access: Open access

Abstract

Using voice-operated robots in real-life settings introduces multiple issues that do not arise in controlled laboratory conditions. In our study, we placed conversation robots in the homes of 18 older adults to increase the participants’ conversation activity. A manual examination of the audio that the robot classified as human voice showed that a considerable amount came from television sounds in the participants’ homes. We used this data to train a neural network that differentiates between human speech and speech-like sounds from television, achieving high performance metrics. We extended our analysis to examine how the participants’ voices contain inherent patterns that can be general or uncommon, and how this affects the performance of our algorithm when identifying human speech with or without these patterns.

Keywords

Conversation; Computer science; Speech recognition; Human voice; Robot; Voice activity detection; Human–computer interaction; Psychology; Speech processing; Communication
