HRI

TMNet: Transformer-fused multimodal framework for emotion recognition via EEG and speech

Md. Muntasir Ul Alam, Mohamed Abubakar Dini, Dongseon Kim, Taesoo Jun

Publication year
2025
Citations
12

Abstract

In the evolving field of emotion recognition, which intersects psychology, human–computer interaction, and social robotics, there is a growing demand for more advanced and accurate frameworks. The traditional reliance on single-modal approaches has given way to a focus on multimodal emotion recognition, which offers enhanced performance by integrating multiple data sources. This paper introduces TMNet, a multimodal emotion recognition framework that leverages both speech and electroencephalography (EEG) signals. The framework employs a Transformer model to fuse the CNN-BiLSTM and BiGRU architectures, creating a unified multimodal representation for emotion recognition. The model uses a diverse set of speech datasets (RAVDESS, SAVEE, TESS, and CREMA-D), along with EEG signals captured via the Muse headband, and achieves an accuracy of 98.89% for speech and EEG signal processing.

Keywords

Electroencephalography, Transformer, Speech recognition, Computer science, Emotion recognition, Psychology, Engineering, Neuroscience, Electrical engineering, Voltage
