Bag-of-words representation for non-intrusive speech quality assessment

2015 IEEE CHINA SUMMIT & INTERNATIONAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (2015)

Abstract
Research on non-intrusive speech quality assessment (SQA) aims to develop a computational model that simulates human perception of speech signals accurately and automatically, without any prior information about the clean reference speech signals. In this paper, we propose to learn a non-intrusive SQA metric based on a bag-of-words (BoW) representation of speech signals. In particular, the proposed method treats the whole speech utterance as a text document and extracts perceptual linear prediction (PLP) features of local segments as words. The speech utterance is then represented as a histogram of codewords, where each entry is the probability that a codeword appears in the utterance. After the BoW representation of the speech signal is obtained, support vector regression (SVR) is used to learn the metric for quality evaluation. Experimental results demonstrate that the proposed non-intrusive SQA metric, BoW, can obtain better performance than relevant state-of-the-art metrics.
Keywords
bag of words, codebook construction, speech quality, non-intrusive quality assessment, support vector regression
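
The histogram step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the codebook is built or how segments are assigned to codewords, so nearest-codeword (hard) assignment under squared Euclidean distance is an assumption here, and the function names `nearest_codeword` and `bow_histogram`, the toy 2-D vectors standing in for PLP features, and the three-entry codebook are all hypothetical. The SVR stage that maps the histogram to a quality score is not shown.

```python
from collections import Counter
from math import fsum


def nearest_codeword(frame, codebook):
    """Return the index of the codeword closest to `frame`.

    Assumption: squared Euclidean distance with hard assignment.
    """
    def sq_dist(a, b):
        return fsum((x - y) ** 2 for x, y in zip(a, b))

    return min(range(len(codebook)), key=lambda k: sq_dist(frame, codebook[k]))


def bow_histogram(frames, codebook):
    """Represent an utterance as relative codeword frequencies.

    Each entry is the fraction of frames assigned to that codeword,
    so the entries sum to 1.
    """
    if not frames:
        raise ValueError("utterance has no frames")
    counts = Counter(nearest_codeword(f, codebook) for f in frames)
    return [counts[k] / len(frames) for k in range(len(codebook))]


# Toy 2-D vectors standing in for per-segment PLP features (hypothetical values).
codebook = [(0.0, 0.0), (1.0, 1.0), (4.0, 4.0)]
frames = [(0.1, -0.1), (0.9, 1.2), (1.1, 0.8), (3.8, 4.1)]

histogram = bow_histogram(frames, codebook)
print(histogram)  # [0.25, 0.5, 0.25]
```

In a full pipeline under these assumptions, one such fixed-length histogram per utterance would serve as the input vector to a regressor trained against subjective quality scores.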