Stacked deep convolutional auto-encoders for emotion recognition from facial expressions
Ariel Ruiz-Garcia, Mark Elshaw, Abdulrahman Altahhan, Vasile Palade
- Year
- 2017
- Citations
- 54
Abstract
Emotion recognition is critical for everyday living and is essential for meaningful interaction. If we are to progress towards human and machine interaction that is engaging the human user, the machine should be able to recognize the emotional state of the user. Deep Convolutional Neural Networks (CNN) have proven to be efficient in emotion recognition problems. The good degree of performance achieved by these classifiers can be attributed to their ability to self-learn a down-sampled feature vector that retains spatial information through filter kernels in convolutional layers. Given the view that random initialization of weights can lead to convergence to non-optimal local minima, in this paper we explore the impact of training the initial weights in an unsupervised manner. We study the effect of pre-training a Deep CNN as a Stacked Convolutional Auto-Encoder (SCAE) in a greedy layer-wise unsupervised fashion for emotion recognition using facial expression images. When trained with randomly initialized weights, our CNN emotion recognition model achieves a performance rate of 91.16% on the Karolinska Directed Emotional Faces (KDEF) dataset. In contrast, when each layer of the model, including the hidden layer, is pre-trained as an Auto-Encoder, the performance increases to 92.52%. Pre-training our CNN as a SCAE also reduces training time marginally. The emotion recognition model developed in this work will form the basis of a real-time empathic robot system.
Keywords
Related papers
Statistical Learning Theory
Yuhai Wu, Vladimir Vapnik
1999
Artificial intelligence: a modern approach
1995
Applied Nonlinear Control
Jean-Jacques Slotine, Weiping Li
1991
A new optimizer using particle swarm theory
R.C. Eberhart, James Kennedy
2002