Efficient Black-Box Speaker Verification Model Adaptation with Reprogramming and Backend Learning

ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)(2023)

引用 0|浏览4
暂无评分
摘要
The development of deep neural networks (DNN) has significantly enhanced the performance of speaker verification (SV) systems in recent years. However, a critical issue that persists when applying DNN-based SV systems in practical applications is domain mismatch. To mitigate the performance degradation caused by the mismatch, domain adaptation becomes necessary. This paper introduces an approach to adapt DNN-based SV models by manipulating the learnable model inputs, inspired by the concept of adversarial reprogramming. The pre-trained SV model remains fixed and functions solely in the forward process, resembling a black-box model. A lightweight network is utilized to estimate the gradients for the learnable parameters at the input, which bypasses the gradient backpropagation through the black-box model. The reprogrammed output is processed by a two-layer backend learning module as the final adapted speaker embedding. The number of parameters involved in the gradient calculation is small in our design. With few additional parameters, the proposed method achieves both memory and parameter efficiency. The experiments are conducted in language mismatch scenarios. Using much less computation cost, the proposed method obtains close or superior performance to the fully finetuned models in our experiments, which demonstrates its effectiveness.
更多
查看译文
关键词
Speaker verification,domain adaptation,reprogramming,black-box models
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要