HRI

Emotion Recognition from Speech to Improve Human-Robot Interaction

Changrui Zhu, Wasim Ahmad

Year: 2019
Citations: 9

Abstract

Speech emotion recognition (SER) has become an important approach to improving human-robot interaction. This paper proposes two methods that take into account the size of the database along with other aspects of the models. The first model, intended mainly for relatively small databases, applies the K-nearest neighbors (KNN) algorithm to Gammatone frequency cepstral coefficients (GTCCs) 1-30. It achieved 95.3% overall recognition accuracy on the Berlin Emotional Speech database (EMODB). The second model, intended mainly for relatively large databases, uses GTCCs 1-30, delta GTCCs 1-30, delta-delta GTCCs 1-30, spectral features, and prosodic features as the feature set, with a long short-term memory (LSTM) network as the classifier. It achieved an overall accuracy of 87.5% on the Chinese emotional speech database (CASIA).
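
To make the first model's classification step concrete, the following is a minimal sketch of K-nearest-neighbor voting over 30-dimensional feature vectors. It is not the authors' implementation. It assumes each utterance has already been summarized as a single vector of 30 GTCC values (the abstract does not say how frames are aggregated), it uses Euclidean distance and k=3 (neither is stated in the abstract), and it substitutes synthetic random vectors for real EMODB features. The names `knn_predict` and `make_synthetic_utterances` are hypothetical.

```python
import math
import random
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k training vectors nearest to query.

    Distance is Euclidean; this choice and k=3 are assumptions of the sketch.
    """
    # Pair every training vector's distance to the query with its label,
    # then sort so the closest neighbors come first.
    dists = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def make_synthetic_utterances(n_per_class, n_coeffs=30, seed=0):
    """Build stand-in feature vectors; real GTCCs would come from audio."""
    rng = random.Random(seed)
    # Hypothetical class centers, chosen only so the two classes separate.
    centers = {"anger": 1.0, "sadness": -1.0}
    xs, ys = [], []
    for label, center in centers.items():
        for _ in range(n_per_class):
            xs.append([rng.gauss(center, 0.5) for _ in range(n_coeffs)])
            ys.append(label)
    return xs, ys


train_x, train_y = make_synthetic_utterances(20)
query = [1.0] * 30  # a vector sitting at the "anger" center
predicted = knn_predict(train_x, train_y, query, k=3)
print(predicted)
```

With real data, `train_x` would hold one GTCC-based vector per labeled utterance, and both k and the distance metric would be tuned on held-out recordings.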

Keywords

Computer science, Human–robot interaction, Speech recognition, Emotion recognition, Human–computer interaction, Robot, Artificial intelligence, Natural language processing
