Non-Intrusive Speech Intelligibility Prediction for Hearing-Impaired Users using Intermediate ASR Features and Human Memory Models
CoRR(2024)
摘要
Neural networks have been successfully used for non-intrusive speech
intelligibility prediction. Recently, the use of feature representations
sourced from intermediate layers of pre-trained self-supervised and
weakly-supervised models has been found to be particularly useful for this
task. This work combines the use of Whisper ASR decoder layer representations
as neural network input features with an exemplar-based, psychologically
motivated model of human memory to predict human intelligibility ratings for
hearing-aid users. Substantial performance improvement over an established
intrusive HASPI baseline system is found, including on enhancement systems and
listeners unseen in the training data, with a root mean squared error of 25.3
compared with the baseline of 28.7.
更多查看译文
关键词
speech recognition,intelligibility prediction,hearing impairment
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要