Speech Emotion Recognition Based on Image Enhancement
ISKE(2019)
Abstract
The performance of an emotion recognition system depends on the quality of its emotional features. In this paper, we propose a feature optimization algorithm based on image enhancement and present a convolutional recurrent model for emotion recognition in natural speech. For three-dimensional (3-D) log-Mel spectrum and 3-D spectrogram features, a fast gamma transformation with an adaptive threshold is applied to enhance the features and make full use of the dynamic characteristics of non-stationary speech signals. A model combining a Convolutional Neural Network (CNN) with rectangular kernels and Long Short-Term Memory (LSTM) then performs the speech emotion recognition task. Experiments on two public emotional datasets demonstrate the good generalization ability and recognition performance of the proposed model.
Keywords
speech emotion recognition, CNN, LSTM, features enhancement, rectangular kernels
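
The abstract describes enhancing spectrogram-like features with a fast gamma transformation and an adaptive threshold, but does not give the exact rule. The following is a minimal sketch of the general idea only, not the authors' algorithm. It assumes that "fast" means a precomputed lookup table and that the threshold is the mean of the normalized feature map. The function name `gamma_enhance` and the parameters `gamma_low`, `gamma_high`, and `levels` are hypothetical.

```python
def gamma_enhance(feature, gamma_low=0.5, gamma_high=2.0, levels=256):
    """Gamma-transform a 2-D feature map (list of rows) into [0, 1].

    Assumptions (not from the paper): the adaptive threshold is the mean
    of the min-max normalized map, and the gamma value is picked by
    comparing that threshold with 0.5.
    """
    flat = [v for row in feature for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A constant map carries no contrast to enhance.
        return [[0.0 for _ in row] for row in feature]
    span = hi - lo

    # Adaptive threshold: mean intensity after min-max normalization.
    threshold = sum((v - lo) / span for v in flat) / len(flat)

    # Mostly dark maps are brightened (gamma < 1); bright maps are darkened.
    gamma = gamma_low if threshold < 0.5 else gamma_high

    # Lookup table: the power is evaluated once per level, not once per bin.
    lut = [(i / (levels - 1)) ** gamma for i in range(levels)]

    return [
        [lut[round((v - lo) / span * (levels - 1))] for v in row]
        for row in feature
    ]


if __name__ == "__main__":
    toy_spectrum = [[0.0, 1.0], [2.0, 10.0]]  # made-up values for illustration
    print(gamma_enhance(toy_spectrum))
```

For a 3-D feature, a function like this could be applied to each channel separately. Whether the paper does so is not stated in the abstract.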