Speech Emotion Recognition: Performance Analysis based on Fused Algorithms and GMM Modelling

Indian Journal of Science and Technology (2016)

Abstract
Background/Objectives: Speech emotion recognition (SER) is an important aspect of Human-Computer Interaction systems and is widely used in sectors such as healthcare, robotics, automatic call centres and distance education. SER involves in-depth analysis of the speech signal and identification of the appropriate emotion from a trained database using extracted features. Method/Statistical Analysis: This paper aims to devise an SER system using the One-Sided Autocorrelation Linear Predictive Coding (OSALPC) algorithm, that is, linear prediction of the causal part of the autocorrelation sequence, which has been shown to reduce noise efficiently. It is used alongside Linear Frequency Cepstral Coefficients (LFCC), Linear Predictive Coding (LPC), Mel Frequency Cepstral Coefficients (MFCC) and LPC-derived cepstrum for feature extraction. After the feature vectors are extracted from the voice signal, they are modelled using Gaussian Mixture Models (GMM). The MAP (maximum a posteriori) rule is used for decision making. Findings: Performance was analysed, and the proposed system showed an overall efficiency of 89% when tested on the German database (Emo-DB) for 7 emotions. This overall efficiency is higher than that reported by earlier studies on the same database. The highest emotion recognition rate, 95.56%, was obtained for SAD using the fused algorithm. Results obtained with Modified MFCC were also tabulated and compared. A graphical user interface for the proposed system was also devised. Application/Improvements: The applications of speech emotion recognition are far-reaching. Further work will compare the recognition rate achieved by the algorithms with the recognition rate achieved by humans.
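The decision stage described in the abstract (one GMM per emotion, MAP rule over the extracted feature vectors) can be illustrated with a minimal sketch. It is not the authors' implementation: the diagonal-covariance GMMs, the two-dimensional toy features, the equal priors, and all names (`log_gauss_diag`, `gmm_log_likelihood`, `map_decide`, `MODELS`, `PRIORS`) are assumptions made for illustration, and the model parameters are hand-picked rather than trained on Emo-DB.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(frames, gmm):
    """Sum over frames of log p(frame | GMM), treating frames as independent."""
    total = 0.0
    for x in frames:
        comps = [
            math.log(w) + log_gauss_diag(x, mean, var)
            for w, mean, var in gmm
        ]
        peak = max(comps)  # log-sum-exp trick for numerical stability
        total += peak + math.log(sum(math.exp(c - peak) for c in comps))
    return total


def map_decide(frames, models, priors):
    """Return the label maximising log P(label) + log p(frames | label)."""
    scores = {
        label: math.log(priors[label]) + gmm_log_likelihood(frames, gmm)
        for label, gmm in models.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical toy models: each GMM is a list of (weight, mean, variance).
MODELS = {
    "sad": [(0.6, [-1.0, 0.0], [0.5, 0.5]), (0.4, [-2.0, 1.0], [0.5, 0.5])],
    "anger": [(0.5, [2.0, 1.0], [0.5, 0.5]), (0.5, [3.0, -1.0], [0.5, 0.5])],
}
PRIORS = {"sad": 0.5, "anger": 0.5}  # assumed equal priors

# Made-up feature frames standing in for per-frame cepstral vectors.
frames = [[-1.2, 0.1], [-1.8, 0.9], [-0.9, -0.2]]
label, scores = map_decide(frames, MODELS, PRIORS)
print(label)
```

With equal priors, the MAP rule reduces to picking the emotion whose GMM gives the highest likelihood for the utterance. In a real system the mixture parameters would be estimated from training features (for example with expectation-maximisation) and the frames would come from the feature extraction stage.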