Emotion Recognition from Speech to Improve Human-Robot Interaction

Changrui Zhu, Wasim Ahmad

Year of publication: 2019
Citations: 9

Abstract

Speech emotion recognition (SER) has become an important approach to improving human-robot interaction. This paper proposes two methods, chosen with the size of the available database and other aspects of the models in mind. The first model, intended mainly for relatively small databases, applies the K-nearest neighbors (KNN) algorithm to Gammatone frequency cepstral coefficients (GTCCs) 1-30. It achieved 95.3% overall recognition accuracy on the Berlin Emotional Speech database (EMODB). The second model, aimed mainly at relatively large databases, uses GTCCs 1-30, delta GTCCs 1-30, delta-delta GTCCs 1-30, spectral features, and prosodic features as the feature set, with a long short-term memory (LSTM) network as the classifier. It achieved an overall accuracy of 87.5% on the Chinese emotional speech database (CASIA).
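As a rough illustration of the first model's idea, the sketch below classifies an utterance by KNN over per-utterance feature vectors, and also shows one simple way to derive delta features from per-frame coefficients. It is not the authors' implementation. The abstract does not state the value of k, the distance metric, how frames are pooled, or the delta formula, so Euclidean distance, mean pooling, k=3, and a central-difference delta are all assumptions, as are the function names and the toy 2-dimensional data standing in for 30 GTCCs.

```python
import math
from collections import Counter


def mean_vector(frames):
    """Pool per-frame coefficient vectors into one utterance-level vector.

    Mean pooling is an assumption; the abstract does not say how frames are combined.
    """
    n = len(frames)
    return [sum(col) / n for col in zip(*frames)]


def delta(frames):
    """Central-difference delta features, with edge frames repeated as padding.

    This is a simple stand-in; the paper's exact delta formula is not given.
    Applying the function twice yields delta-delta features.
    """
    padded = [frames[0]] + frames + [frames[-1]]
    return [
        [(nxt - prv) / 2.0 for prv, nxt in zip(padded[i - 1], padded[i + 1])]
        for i in range(1, len(padded) - 1)
    ]


def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k training vectors closest to query.

    Euclidean distance and k=3 are assumptions, not values from the paper.
    """
    dists = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Toy data: 2-dimensional vectors stand in for 30-dimensional GTCC vectors.
train_x = [[0.0, 0.1], [0.2, 0.0], [1.0, 1.1], [0.9, 1.0]]
train_y = ["neutral", "neutral", "anger", "anger"]

# Pool a hypothetical two-frame utterance, then classify it.
query = mean_vector([[0.0, 0.2], [0.2, 0.0]])
predicted = knn_predict(train_x, train_y, query, k=3)
```

In practice the GTCCs would come from an audio feature extractor, and a library classifier would likely replace the hand-written distance search; the pure-Python version is used here only to keep the sketch self-contained.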

Keywords

Computer science; Human–robot interaction; Speech recognition; Emotion recognition; Human–computer interaction; Robot; Artificial intelligence; Natural language processing
