Cross-Domain Diffusion Based Speech Enhancement for Very Noisy Speech

ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)(2023)

引用 0|浏览1
暂无评分
摘要
Deep learning based speech enhancement has achieved remarkable success, but challenges remain in low signal-to-noise ratio (SNR) nonstationary noise scenarios. In this study, we propose to incorporate diffusion-based learning into an enhancement model and improve robustness in extremely noisy conditions. Specifically, a frequency-domain diffusion-based generative module is employed, and it accepts the enhanced signal obtained from a time-domain supervised enhancement module as an auxiliary input to learn to recover clean speech spectrograms. Experimental results on the TIMIT dataset demonstrate the advantage of this approach and show better enhancement performance over other strong baselines in both -5 and -10 dB SNR noisy conditions.
更多
查看译文
关键词
speech enhancement,diffusion model,generative model,low signal-to-noise ratio
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要