Towards real-time Speech Emotion Recognition using deep neural networks

2015 9th International Conference on Signal Processing and Communication Systems (ICSPCS)(2015)

引用 47|浏览48
暂无评分
摘要
Most existing Speech Emotion Recognition (SER) systems rely on turn-wise processing, which aims at recognizing emotions from complete utterances and an overly-complicated pipeline marred by many preprocessing steps and hand-engineered features. To overcome both drawbacks, we propose a real-time SER system based on end-to-end deep learning. Namely, a Deep Neural Network (DNN) that recognizes emotions from a one second frame of raw speech spectrograms is presented and investigated. This is achievable due to a deep hierarchical architecture, data augmentation, and sensible regularization. Promising results are reported on two databases which are the eNTERFACE database and the Surrey Audio-Visual Expressed Emotion (SAVEE) database.
更多
查看译文
关键词
affective computing,deep neural networks,deep learning,emotion recognition,spectrograms,speech processing
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要