Multimodal Gesture Recognition Using 3-D Convolution and Convolutional LSTM

Guangming Zhu, Liang Zhang, Peiyi Shen, Juan Song

Publication year
2017
Citations
277

Abstract

Gesture recognition aims to recognize meaningful movements of human bodies, and is of utmost importance in intelligent human-computer/robot interactions. In this paper, we present a multimodal gesture recognition method based on 3-D convolution and convolutional long short-term memory (LSTM) networks. The proposed method first learns short-term spatiotemporal features of gestures through a 3-D convolutional neural network, and then learns long-term spatiotemporal features with convolutional LSTM networks applied to the extracted short-term features. In addition, fine-tuning among multimodal data is evaluated, and we find that it can be considered an optional technique to prevent overfitting when no pre-trained models exist. The proposed method is verified on the ChaLearn LAP large-scale isolated gesture data set (IsoGD) and the Sheffield Kinect gesture (SKIG) data set. The results show that the proposed method achieves state-of-the-art recognition accuracy (51.02% on the validation set of IsoGD and 98.89% on SKIG).
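
The two-stage design described in the abstract can be illustrated with a small shape-tracing sketch: a 3-D convolutional stage that shortens and downsamples the clip, followed by a convolutional LSTM that keeps one spatial feature map per remaining time step. This is only a rough illustration. The clip size, layer counts, channel widths, kernel sizes, and the helper names (`FeatureShape`, `conv3d`, `pool3d`, `conv_lstm`) are all assumptions chosen for the example and are not taken from the paper; the code tracks tensor shapes only and does not train or run a network.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureShape:
    """Shape of a video feature tensor (hypothetical helper, batch axis omitted)."""
    frames: int
    height: int
    width: int
    channels: int


def conv_out(size, kernel, stride=1, padding=0):
    """Output length along one axis for a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


def conv3d(x, out_channels, kernel=3, padding=1):
    """3-D convolution: one kernel slides over time, height, and width together."""
    return FeatureShape(
        conv_out(x.frames, kernel, 1, padding),
        conv_out(x.height, kernel, 1, padding),
        conv_out(x.width, kernel, 1, padding),
        out_channels,
    )


def pool3d(x, t=1, s=2):
    """Non-overlapping max pooling with temporal window t and spatial window s."""
    return FeatureShape(
        conv_out(x.frames, t, t),
        conv_out(x.height, s, s),
        conv_out(x.width, s, s),
        x.channels,
    )


def conv_lstm(x, hidden_channels, kernel=3, padding=1):
    """Convolutional LSTM: gates are 2-D convolutions, so the hidden state
    stays a spatial map, and one such map is emitted per time step."""
    return FeatureShape(
        x.frames,
        conv_out(x.height, kernel, 1, padding),
        conv_out(x.width, kernel, 1, padding),
        hidden_channels,
    )


# Assumed input: a 32-frame RGB clip at 112x112 (illustrative values only).
clip = FeatureShape(frames=32, height=112, width=112, channels=3)

# Stage 1: short-term spatiotemporal features from 3-D convolutions.
short_term = pool3d(conv3d(clip, 64), t=1, s=2)
short_term = pool3d(conv3d(short_term, 128), t=2, s=2)

# Stage 2: long-term features from a ConvLSTM run over the remaining time steps.
long_term = conv_lstm(short_term, 256)

print("short-term features:", short_term)
print("long-term features: ", long_term)
```

The point of the sketch is that the 3-D convolutional stage preserves a (reduced) time axis rather than collapsing it, which is what lets a recurrent stage model longer-range temporal structure on top of it while the spatial layout of the features is retained.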

Keywords

Computer science, Convolutional neural network, Gesture, Overfitting, Artificial intelligence, Gesture recognition, Convolution (computer science), Set (abstract data type), Data set, Hidden Markov model
