DATA2VEC-SG: Improving Self-Supervised Learning Representations for Speech Generation Tasks

ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)(2023)

引用 0|浏览19
暂无评分
摘要
Self-supervised learning has been successfully applied to various speech recognition and understanding tasks. However, for generative tasks such as speech enhancement and speech separation, most self-supervised speech representations did not show substantial improvements. To deal with this problem, in this paper, we propose data2vec-SG (Speech Generation), which is a teacher-student learning framework that addresses speech generation tasks. Our data2vec-SG introduces a reconstruction module into data2vec [1] and enforces the representations to contain not only the semantic information but also the acoustic knowledge to generate clean speech waveforms. Experimental results demonstrate that the proposed framework boosts the performance of various speech generation tasks including speech enhancement, speech separation, and packet loss concealment. Meanwhile, the learned representation is also capable of helping other downstream tasks, which is demonstrated by the good performance in the speech recognition task in both clean and noisy conditions.
更多
查看译文
关键词
self-supervised learning,speech enhancement,speech separation,packet loss concealment,automatic speech recognition
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要