Machine learning approach of speech emotions recognition using feature fusion technique

MULTIMEDIA TOOLS AND APPLICATIONS (2024)

Abstract
With advances in machine learning, identifying emotional states from speech is expected to have a profound impact on artificial intelligence. Proper feature selection plays a vital role in such emotion recognition. This study therefore proposes a feature fusion technique that aims for high prediction accuracy by prioritizing the extraction of individual features. Mel Frequency Cepstral Coefficients (MFCC), Linear Predictive Coefficients (LPC), energy, Zero Crossing Rate (ZCR), and pitch are extracted, and four different models are constructed to examine the impact of feature fusion on four standard machine learning classifiers: Support Vector Machine (SVM), Linear Discriminative Analysis (LDA), Decision Tree (D-Tree), and K-Nearest Neighbour (KNN). Applying feature fusion to these classifiers yields recognition rates of 96.90% on the Bengali (Indian regional language) SUST Bangla Emotional Speech Corpus (SUBESCO), 99.82% on the Toronto Emotional Speech Set (TESS) (English), 95% on the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) (English), and 95.33% on the Berlin Database of Emotional Speech (EMO-DB) (German). The results indicate that proper fusion of features has a positive impact on emotion detection systems by increasing their accuracy and applicability.
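As a rough illustration of feature-level fusion, the sketch below computes two of the features named in the abstract (short-time energy and ZCR), summarizes each over frames, concatenates the summaries into one vector, and classifies with a 1-nearest-neighbour rule. It is not the authors' pipeline: the abstract does not specify how features are fused, so concatenation is an assumption, MFCC, LPC, and pitch are omitted for brevity, the input signals are synthetic sine tones rather than speech, and the labels "calm" and "excited" and all function names are hypothetical.

```python
import math

def frame_signal(signal, frame_len, hop):
    """Split a list of samples into overlapping fixed-length frames."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def mean_std(values):
    """Summarize a per-frame feature track by its mean and standard deviation."""
    m = sum(values) / len(values)
    var = sum((v - m) ** 2 for v in values) / len(values)
    return [m, math.sqrt(var)]

def fused_features(signal, frame_len=256, hop=128):
    """Feature-level fusion (assumed here to be plain concatenation)."""
    frames = frame_signal(signal, frame_len, hop)
    energy = [short_time_energy(f) for f in frames]
    zcr = [zero_crossing_rate(f) for f in frames]
    return mean_std(energy) + mean_std(zcr)

def nearest_neighbour(train, query):
    """1-NN with Euclidean distance; train is a list of (vector, label)."""
    return min(train, key=lambda item: math.dist(item[0], query))[1]

def tone(freq, amp, n=2048, sr=8000):
    """Synthetic sine tone standing in for a speech clip (illustration only)."""
    return [amp * math.sin(2 * math.pi * freq * t / sr) for t in range(n)]

# Hypothetical two-class training set built from synthetic tones.
train = [
    (fused_features(tone(150, 0.2)), "calm"),
    (fused_features(tone(600, 0.9)), "excited"),
]
prediction = nearest_neighbour(train, fused_features(tone(550, 0.8)))
print(prediction)
```

In practice the fused features would usually be scaled to comparable ranges before a distance-based classifier such as KNN is applied, since energy and ZCR live on different scales.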
Keywords
Feature fusion, Speech emotion recognition (SER), Mel frequency cepstral coefficients, Linear predictive coefficients, Support vector machine, Linear discriminative analysis