MLP-SVNET: A Multi-Layer Perceptrons Based Network for Speaker Verification

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Abstract
Convolution-based and self-attention-based neural networks have both achieved excellent performance in automatic speaker verification. However, convolutional models often struggle to model long-term dependencies because of their limited receptive field, while self-attention models are weak at modeling local information. To address these limitations, we propose a multi-layer perceptron based speaker verification network (MLP-SVNet), which applies MLPs across the temporal and frequency dimensions to capture local and global information at the same time. Experimental results on VoxCeleb show that the proposed model is competitive with systems based on convolution or self-attention. In addition, we demonstrate that MLP-SVNet produces complementary embeddings, which can be fused with a state-of-the-art system to further improve performance.
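The idea of applying MLPs across the temporal and frequency dimensions can be illustrated with a minimal sketch. The abstract does not describe the actual block design, so everything below is an assumption: the block loosely follows the MLP-Mixer pattern (one MLP along the time axis, one along the frequency axis, each with a residual connection), and the names `mixing_block`, `init_params`, and the layer sizes are hypothetical. It also assumes a fixed number of input frames, since the temporal MLP weights are tied to the sequence length.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    """Swap the two axes of a matrix."""
    return [list(col) for col in zip(*m)]

def relu(m):
    """Element-wise ReLU."""
    return [[max(0.0, v) for v in row] for row in m]

def add(a, b):
    """Element-wise sum, used for the residual connections."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def rand_matrix(rows, cols, rng):
    """Small random weights; a stand-in for trained parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def mlp(x, w1, w2):
    """Two-layer MLP applied independently to each row of x."""
    return matmul(relu(matmul(x, w1)), w2)

def init_params(num_frames, num_bins, hidden, seed=0):
    """Hypothetical parameter layout for one mixing block."""
    rng = random.Random(seed)
    return {
        "t_w1": rand_matrix(num_frames, hidden, rng),
        "t_w2": rand_matrix(hidden, num_frames, rng),
        "f_w1": rand_matrix(num_bins, hidden, rng),
        "f_w2": rand_matrix(hidden, num_bins, rng),
    }

def mixing_block(x, params):
    """x has shape (frames, frequency bins); the output has the same shape."""
    # Temporal mixing: after the transpose each row is one frequency bin
    # across all frames, so the MLP can combine every frame in one step.
    t_mixed = transpose(mlp(transpose(x), params["t_w1"], params["t_w2"]))
    x = add(x, t_mixed)
    # Frequency mixing: each row is one frame, so the MLP combines the
    # frequency bins within that frame.
    f_mixed = mlp(x, params["f_w1"], params["f_w2"])
    return add(x, f_mixed)

# Toy input: 6 frames x 4 frequency bins of made-up feature values.
features = rand_matrix(6, 4, random.Random(1))
params = init_params(num_frames=6, num_bins=4, hidden=8)
out = mixing_block(features, params)
```

In this sketch the temporal MLP connects every frame to every other frame in a single layer, which is one way to obtain a global view that a convolution with a small kernel does not have. How MLP-SVNet itself balances local and global modeling is not specified in the abstract.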
Keywords
Multi-layer Perceptron, Speaker Verification, Speaker Embedding, Text-independent