Deep Learning Based Emotion Classification Using Mel Frequency Magnitude Coefficient

2023 1st International Conference on Innovations in High Speed Communication and Signal Processing (IHCSP) (2023)

Abstract
Speech emotion recognition (SER) is growing in popularity because of its many practical applications, and it plays an essential role in enhancing human-computer interaction (HCI). Recognizing emotion from the speech signal nevertheless remains a complicated and challenging task. Many authors have used different methods to improve SER accuracy. Proper selection of features and suitable design of machine learning and deep learning models can improve the recognition rate. In this work, we used a modified version of the mel frequency cepstral coefficient (MFCC) feature, named the mel frequency magnitude coefficient (MFMC), with convolutional neural network (CNN) and deep neural network (DNN) classifiers to enhance SER. We used MFMC and MFCC features as input to the CNN and DNN classifiers and evaluated the SER accuracy. We made two observations from our experiments. First, the MFMC feature performs better than the MFCC feature in SER for both classifiers. Second, the proposed DNN classifier achieved better accuracy than the CNN classifier for both features (MFMC and MFCC). The MFMC feature with the DNN classifier achieved accuracies of 76.72%, 84.72%, 77.88%, and 100% on the RAVDESS, EMODB, SAVEE, and TESS datasets, respectively. Similarly, the CNN classifier with the MFMC feature achieved accuracies of 72.9%, 82.41%, 74.5%, and 100% on the same datasets. We compared our proposed work with state-of-the-art models and found that our model performed better than the others.
Keywords
Mel frequency cepstral coefficient, Mel frequency magnitude coefficient, Deep neural network, Convolutional neural network, Speech emotion recognition