Variance-Preserving-Based Interpolation Diffusion Models for Speech Enhancement

Zhongliang Guo, Jie Du, Chin‐Hui Lee, Yu Gao, Wenbin Zhang

arXiv (Cornell University) (2023)

Abstract
The goal of this study is to implement diffusion models for speech enhancement (SE). We first emphasize the theoretical foundation of variance-preserving (VP)-based interpolation diffusion under continuous conditions. We then present a more concise framework that encapsulates both the VP- and variance-exploding (VE)-based interpolation diffusion methods, and demonstrate that these two methods are special cases of the proposed framework. Additionally, we provide a practical example of VP-based interpolation diffusion for the SE task. To improve performance and ease model training, we analyze the common difficulties encountered in diffusion models and suggest suitable hyper-parameters. Finally, we evaluate our model against several methods on a public benchmark to showcase the effectiveness of our approach.
Keywords
interpolation diffusion models, enhancement, speech