Boosting the Performance of SpEx+ by Attention and Contextual Mechanism
2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), 2022
Abstract
Target speaker extraction (TSE) aims to mimic human selective attention by extracting the voice of interest from a multi-talker environment. Time-domain methods, represented by SpEx+ [1], have advanced TSE, but residual noise, squeaks, and over-suppression still exist in the extracted speech. In this paper, we explore three ways to improve the performance of SpEx+: two attention-based weight-learning mechanisms that operate on different dimensions to generate representative features, and a contextual mechanism that refines the extracted masks. Experiments on both single-channel and multi-channel signals preliminarily demonstrate the effectiveness of the explored methods on SpEx+, especially in improving speech quality and alleviating squeaks, unexpected noise, and over-suppression.
Keywords
Target speaker extraction, SpEx, attention, contextual mechanism