R-Vectors: New Technique for Adaptation to Room Acoustics

INTERSPEECH (2019)

Abstract
Distant speech recognition is an important problem that is far from solved. Reverberation and noise are among the main difficulties in this area. The most popular methods of dealing with them are data augmentation and speech enhancement. In this paper, we propose a novel approach inspired by modern methods of speaker adaptation. First, a feed-forward network is trained to classify room impulse responses (RIRs) from speech recordings. This network is then used to extract embeddings, which we call R-vectors. These R-vectors are appended to the input features of the acoustic model. Because labeled data for the RIR classification task is lacking, we propose a self-supervised method of training the network that uses artificial audio generated by a room simulator. Experimental evaluation was conducted on the VOiCES19 and AMI single-channel tasks as well as the CHiME5 multi-channel task. The R-vector-adapted ASR systems achieve up to 14% relative WER reduction. Furthermore, this improvement is additive with the gains from state-of-the-art dereverberation (WPE) and speaker adaptation (x-vector) techniques.
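
The data flow described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the exponentially decaying noise used as a stand-in for a room simulator, the random placeholder audio and features, the untrained single hidden layer, the mean pooling over frames, and all function names and dimensions. It shows only how simulated RIRs can supply class labels without manual annotation, and how an utterance-level embedding can be appended to every frame of the acoustic model input.

```python
import numpy as np

rng = np.random.default_rng(0)

def synthetic_rir(rt60_s, sr=16000, length_s=0.3):
    """Toy stand-in for a room simulator: exponentially decaying noise."""
    t = np.arange(int(sr * length_s)) / sr
    # Amplitude falls by 60 dB (a factor of 1e-3) after rt60_s seconds.
    envelope = 10.0 ** (-3.0 * t / rt60_s)
    rir = rng.standard_normal(t.size) * envelope
    return rir / np.max(np.abs(rir))

def reverberate(clean, rir):
    """Convolve clean audio with an RIR and keep the original length."""
    return np.convolve(clean, rir)[: clean.size]

def extract_r_vector(frames, w_hidden, b_hidden):
    """Mean-pool hidden-layer activations over time (pooling is an assumption)."""
    hidden = np.maximum(frames @ w_hidden + b_hidden, 0.0)  # ReLU
    return hidden.mean(axis=0)

def append_r_vector(frames, r_vector):
    """Tile the utterance-level embedding and append it to every frame."""
    tiled = np.tile(r_vector, (frames.shape[0], 1))
    return np.concatenate([frames, tiled], axis=1)

# Self-supervised labels: each utterance is labeled with the index of the
# simulated RIR that produced it, so no manual annotation is needed.
rirs = [synthetic_rir(rt60) for rt60 in (0.2, 0.5, 0.9)]
clean = rng.standard_normal(16000)  # placeholder for 1 s of clean speech
dataset = [(reverberate(clean, rir), label) for label, rir in enumerate(rirs)]

# Placeholder features and untrained weights, used only to show the shapes.
frames = rng.standard_normal((100, 40))  # 100 frames x 40-dim features
w_hidden = rng.standard_normal((40, 16)) * 0.1
b_hidden = np.zeros(16)

r_vec = extract_r_vector(frames, w_hidden, b_hidden)
adapted = append_r_vector(frames, r_vec)
print(adapted.shape)  # (100, 56)
```

In a real system the hidden layer would first be trained on the RIR classification labels, and the features would come from the reverberant audio rather than from random numbers.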
Keywords
R-vectors, distant ASR, room acoustics adaptation, VOiCES19 Challenge, CHiME5 challenge, AMI