A Corrective Learning Approach For Text-Independent Speaker Verification

2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP)(2018)

引用 24|浏览68
暂无评分
摘要
We present a conceptually plausible approach for text-independent speaker verification (TISV) which treats speech recordings as a collection of segments providing incremental evidence. This approach, called corrective learning, gradually improves an initial prediction of speaker identity based on incoming speech and the latest prediction. Specifically, we propose deep corrective learning networks (CLNets) that explicitly learn a mapping from a new speech segment and the current predictions, to a correction. Intuitively, the predictions eventually converge to the ground truth after several corrections. Trained on NIST SRE datasets, CLNets outperform current CNN and the i-vector baselines. Moreover, CLNets and i-vectors are complementary, and their fusion leads to significant performance improvements compared to what can be achieved by each of them individually.
更多
查看译文
关键词
Speaker verification, deep corrective learning networks, universal background model, i-vectors
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要