Deep Learning Based Speech Quality Assessment Focusing on Noise Effects

Rahul Jaiswal, Anu Priya

Speech and Computer, SPECOM 2023, Part II (2023)

Abstract
This paper investigates the suitability of different speech features for measuring and monitoring speech quality, with the goal of meeting the level of human-perceived quality of experience (QoE) that users expect from applications such as Microsoft Skype and Apple FaceTime. To this end, two speech features, line spectral frequencies (LSF) and the multi-resolution auditory model (MRAM), are extracted from the speech signal after it has been processed by a voice activity detector (VAD). A series of deep neural network (DNN)-based objective no-reference speech quality models (SQMs) is then developed, each using either a single speech feature or both features combined. Two noisy speech datasets, Supplement-23 and NOIZEUS-2240, are used in the experiments. Simulation results demonstrate that the SQM built on the combined speech features predicts speech quality better than an SQM built on a single speech feature when tested on distinct types of speech degradation.
Keywords
DNN, QoE, Speech Feature, Speech Quality, VAD
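
The pipeline outlined in the abstract (VAD, then feature extraction, then a DNN that maps features to a quality score) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the energy-threshold VAD, the function names (`energy_vad`, `combine_features`, `predict_quality`), the frame length, the threshold, and the single-hidden-layer network are all hypothetical stand-ins, and the LSF and MRAM extractors themselves are not shown.

```python
import math


def energy_vad(signal, frame_len=160, threshold_db=-40.0):
    """Keep frames whose energy lies within threshold_db of the loudest frame.

    Assumption: a simple energy-based detector standing in for the VAD
    mentioned in the abstract, whose actual design is not specified there.
    """
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, frame_len)]
    if not frames:
        return []
    # Small constant avoids log10(0) on perfectly silent frames.
    energies = [sum(s * s for s in f) / frame_len + 1e-12 for f in frames]
    peak = max(energies)
    return [f for f, e in zip(frames, energies)
            if 10.0 * math.log10(e / peak) > threshold_db]


def combine_features(feat_a, feat_b):
    """Concatenate two feature vectors (e.g. LSF and MRAM) into one input."""
    return list(feat_a) + list(feat_b)


def dense(x, weights, bias):
    """Fully connected layer; weights holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def predict_quality(x, hidden_w, hidden_b, out_w, out_b):
    """One ReLU hidden layer followed by a linear scalar output (the score)."""
    hidden = [max(0.0, h) for h in dense(x, hidden_w, hidden_b)]
    return dense(hidden, [out_w], [out_b])[0]
```

In a single-feature model, only one of the two vectors would be passed to the network; in the combined model, the output of `combine_features` would serve as the input. In practice the weights would be learned from the subjective quality labels that accompany the speech datasets.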