Video-Based Emotion Recognition Using Aggregated Features And Spatio-Temporal Information
2018 24th International Conference on Pattern Recognition (ICPR), 2018
Abstract
In this paper, we present a system for video-based emotion recognition in the wild that consists of four pipeline modules: image processing, deep feature extraction, feature aggregation, and emotion classification. Our method focuses on the use of different feature descriptors. To obtain high-level features that are more discriminative for emotion recognition, we aggregate features extracted from different deep convolutional neural networks (CNNs). Furthermore, a long short-term memory network (LSTM) and 3D convolutional networks (C3D) are used to extract spatio-temporal features from videos, combining spatial and temporal information. We evaluate our method in the video-based emotion recognition category of the 5th Emotion Recognition in the Wild Challenge, and the results show that the proposed system achieves better performance.
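The feature aggregation step can be illustrated with a minimal sketch. The abstract does not say how features from the different CNNs are combined, so the sketch assumes plain per-frame concatenation, and it uses mean pooling over frames as a simple stand-in for the temporal modelling that the paper assigns to LSTM and C3D. The function names `aggregate_frame_features` and `pool_over_time` and the toy feature values are hypothetical and are not taken from the paper.

```python
from statistics import fmean


def aggregate_frame_features(per_model_features):
    """Concatenate, frame by frame, the feature vectors from several extractors.

    per_model_features[m][t] is the feature vector (a list of floats) that
    extractor m produced for frame t. Concatenation is an assumption; the
    paper's actual aggregation scheme is not given in the abstract.
    """
    n_frames = len(per_model_features[0])
    if any(len(model) != n_frames for model in per_model_features):
        raise ValueError("every extractor must cover the same number of frames")
    return [
        [value for model in per_model_features for value in model[t]]
        for t in range(n_frames)
    ]


def pool_over_time(frame_features):
    """Average each feature dimension across frames to get one video-level vector.

    Mean pooling is a stand-in here, not the LSTM or C3D used in the paper.
    """
    return [fmean(column) for column in zip(*frame_features)]


# Toy data: two frames, a 2-dimensional extractor and a 1-dimensional extractor.
features_a = [[1.0, 2.0], [3.0, 4.0]]
features_b = [[10.0], [20.0]]

frames = aggregate_frame_features([features_a, features_b])
video_descriptor = pool_over_time(frames)
```

Under these assumptions, `video_descriptor` is a single fixed-length vector per video that could be passed to any emotion classifier.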
Keywords
spatio-temporal features, aggregated features, spatio-temporal information, video-based emotion recognition system, deep feature extraction, feature aggregation, emotion classification, 3D convolutional networks, deep convolutional neural networks, long short-term memory network, feature descriptors, LSTM