CASA-Based Speaker Identification Using Cascaded GMM-CNN Classifier in Noisy and Emotional Talking Conditions

Applied Soft Computing (2021)

Abstract
This work aims to improve text-independent speaker identification performance in real application settings such as noisy and emotional talking conditions. This is achieved by combining two modules: a Computational Auditory Scene Analysis (CASA) based pre-processing module for noise reduction, and a cascaded Gaussian Mixture Model-Convolutional Neural Network (GMM-CNN) classifier for speaker identification followed by emotion recognition. This research proposes and evaluates a novel algorithm to improve the accuracy of speaker identification in emotional and highly noise-susceptible conditions. Experiments demonstrate that the proposed model yields promising results in comparison with other classifiers when the Speech Under Simulated and Actual Stress (SUSAS) database, the Emirati Speech Database (ESD), the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS), and the Fluent Speech Commands database are used in a noisy environment. © 2021 Elsevier B.V. All rights reserved.
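The abstract does not spell out how the cascade is wired, so the following is only a minimal sketch of one plausible first stage: each enrolled speaker is represented by a diagonal-covariance GMM, and an utterance is scored by its average per-frame log-likelihood under each model. The function names, the toy models, and the final `max` decision are all assumptions for illustration; in particular, the `max` merely stands in for the CNN stage, which in a real cascade would consume the GMM scores (or derived features) rather than be replaced by them. CASA pre-processing and emotion recognition are not shown.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_avg_loglik(frames, gmm):
    """Average per-frame log-likelihood under a GMM.

    `gmm` is a list of (weight, mean, var) tuples (hypothetical layout).
    Uses log-sum-exp over components for numerical stability.
    """
    total = 0.0
    for x in frames:
        comps = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
        top = max(comps)
        total += top + math.log(sum(math.exp(c - top) for c in comps))
    return total / len(frames)

def gmm_score_vector(frames, speaker_gmms):
    """One score per enrolled speaker; a second stage would take these as input."""
    return {spk: gmm_avg_loglik(frames, g) for spk, g in speaker_gmms.items()}

def identify(frames, speaker_gmms):
    """Placeholder decision: pick the best-scoring speaker (stand-in for the CNN)."""
    scores = gmm_score_vector(frames, speaker_gmms)
    return max(scores, key=scores.get)

# Toy, made-up models and 2-dimensional "feature frames" for illustration only.
speaker_gmms = {
    "spk_a": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
    "spk_b": [(0.5, [5.0, 5.0], [1.0, 1.0]), (0.5, [6.0, 6.0], [1.0, 1.0])],
}
frames = [[0.2, -0.1], [0.9, 1.1], [0.5, 0.4]]
best_speaker = identify(frames, speaker_gmms)
```

In practice the frames would be acoustic features (for example MFCCs) extracted from the noise-reduced signal, and the GMMs would be trained per speaker rather than written by hand.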
Keywords
Computational Auditory Scene Analysis (CASA), Convolutional Neural Network, Gaussian Mixture Model, Speaker identification