Fusion of Classifier Predictions for Audio-Visual Emotion Recognition

2016 23rd International Conference on Pattern Recognition (ICPR), 2016

Abstract
This paper presents a novel multimodal emotion recognition system based on the analysis of audio and visual cues. MFCC-based features are extracted from the audio channel, and facial landmark geometric relations are computed from the visual data. Both sets of features are learnt separately using state-of-the-art classifiers. In addition, each emotion video is summarised into a reduced set of key-frames, which are learnt by a Convolutional Neural Network to visually discriminate emotions. Finally, the confidence outputs of all classifiers from all modalities define a new feature space that is learnt for the final emotion prediction, in a late fusion/stacking fashion. Experiments conducted on the eNTERFACE'05 database show significant performance improvements of the proposed system over state-of-the-art approaches.
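The fusion step described above can be illustrated with a minimal stacking sketch. The snippet below is an assumption-laden illustration, not the authors' exact configuration: random arrays stand in for the pre-extracted MFCC and landmark features, SVMs stand in for the per-modality base classifiers, and a logistic regression acts as the meta-classifier over the concatenated confidence outputs.

```python
# Minimal late-fusion/stacking sketch: per-modality classifiers produce
# confidence scores, which become the feature space for a meta-classifier.
# Feature extraction (MFCCs, landmark geometry) is assumed to have been
# done already; the placeholder arrays below stand in for it.
import numpy as np
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
n_samples, n_classes = 200, 6                  # eNTERFACE'05 covers 6 emotions
X_audio = rng.normal(size=(n_samples, 39))     # placeholder for MFCC-based features
X_visual = rng.normal(size=(n_samples, 60))    # placeholder for landmark geometric relations
y = rng.integers(0, n_classes, size=n_samples)

# Base classifiers, one per modality; predicted probabilities serve as confidences.
audio_clf = SVC(probability=True)
visual_clf = SVC(probability=True)

# Out-of-fold confidences avoid leaking training labels into the meta-features.
P_audio = cross_val_predict(audio_clf, X_audio, y, cv=5, method="predict_proba")
P_visual = cross_val_predict(visual_clf, X_visual, y, cv=5, method="predict_proba")

# Stack the confidence vectors into a new feature space and learn the fusion.
X_meta = np.hstack([P_audio, P_visual])
meta_clf = LogisticRegression(max_iter=1000).fit(X_meta, y)

print("fused training accuracy:", meta_clf.score(X_meta, y))
```

In the same spirit, confidences from the CNN trained on key-frames would simply be concatenated as additional columns of the meta-feature matrix before fitting the fusion classifier.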
Keywords
classifier prediction fusion, audio-visual emotion recognition, multimodal emotion recognition system, MFCC-based feature extraction, audio channel, facial landmark geometric relations, visual data, emotion video, key-frames, convolutional neural network