Dual-Path Dilated Convolutional Recurrent Network with Group Attention for Multi-Channel Speech Enhancement

Jiaming Cheng, Cong Pang, Ruiyu Liang, Jingjie Fan, Li Zhao

ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2023)

Abstract
This paper proposes a dual-path convolutional recurrent network with group attention for the L3DAS23 Challenge, an ICASSP Signal Processing Grand Challenge. The architecture is a convolutional encoder-decoder, with frequency-time blocks based on group attention inserted between the encoder and the decoder. The encoder extracts local representations from the complex spectrum, groups of time-frequency processing modules capture correlations along the frequency axis and the time axis, and group attention extracts the key information in the feature flow. Our system ranks first in the 3D speech enhancement task of the L3DAS23 Challenge and significantly outperforms the baseline, achieving 0.101 WER and 0.902 STOI on the blind test set.
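The dual-path idea in the abstract (modelling correlation first along the frequency axis, then along the time axis) can be illustrated with a toy sketch. This is not the authors' network: the function names are hypothetical, the feature map is a plain T x F list of lists, and a first-order recurrence stands in for the recurrent layers. The convolutional encoder-decoder and group attention are omitted entirely.

```python
def scan(seq, alpha=0.5):
    """Assumed stand-in for a recurrent layer:
    h[i] = alpha * h[i-1] + (1 - alpha) * x[i], with h[-1] = 0."""
    out, h = [], 0.0
    for x in seq:
        h = alpha * h + (1.0 - alpha) * x
        out.append(h)
    return out


def frequency_path(feat):
    """Scan across frequency bins inside each time frame (feat is T x F)."""
    return [scan(frame) for frame in feat]


def time_path(feat):
    """Scan across time frames for each frequency bin (feat is T x F)."""
    columns = [scan(list(col)) for col in zip(*feat)]  # F sequences of length T
    return [list(row) for row in zip(*columns)]        # transpose back to T x F


def dual_path_block(feat):
    """Frequency path followed by time path; the shape stays T x F."""
    return time_path(frequency_path(feat))


feat = [[1.0, 1.0], [1.0, 1.0]]  # 2 frames x 2 frequency bins
out = dual_path_block(feat)
```

Each output value depends on all earlier bins in its frame and all earlier frames in its bin, which is the sense in which the two paths jointly cover the time-frequency plane.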
Keywords
speech enhancement, multi-channel speech enhancement, deep learning, frequency-time block, group attention