Normal-to-Lombard Speech Conversion by LSTM Network and BGMM for Intelligibility Enhancement of Telephone Speech
2020 IEEE International Conference on Multimedia and Expo (ICME), 2020
Abstract
Environmental noise significantly reduces the intelligibility of speech in telephone conversations. Even when the device outputs clean speech, the listener still has difficulty understanding it. This study focuses on intelligibility enhancement (IENH) of telephone speech in near-end background noise based on normal-to-Lombard speech conversion. The proposed approach uses a long short-term memory (LSTM) network and a Bayesian Gaussian mixture model (BGMM) to build the speech mapping model. Compared with previous studies, we fully consider the short-term correlations of speech and implement feature mappings with higher-dimensional features and more types of features. Evaluations indicate that the proposed approach achieves better results in both objective and subjective evaluations.
Keywords
intelligibility enhancement (IENH), telephone speech, background noise, Lombard speech, feature mapping