End-To-End Detection Of Attacks To Automatic Speaker Recognizers With Time-Attentive Light Convolutional Neural Networks

2019 IEEE 29th International Workshop on Machine Learning for Signal Processing (MLSP) (2019)

Abstract
In this contribution, we introduce convolutional neural network architectures for end-to-end detection of attacks on voice biometrics systems, i.e., the model outputs scores corresponding to the likelihood of an attack given general-purpose time-frequency features obtained from speech. Microphone-level attacks based on speech synthesis and voice conversion techniques are considered, along with presentation replay attacks. While the convolutional models yield a sequence of representations corresponding to different parts of the input at varying time steps, concatenated first- and second-order statistics pooled from the outputs of a self-attention layer are used as a fixed-dimension representation of utterances of varying length, which is then fed into a set of fully connected layers to yield the final scores. The proposed framework is evaluated with data from the ASVspoof 2019 challenge, yielding relative improvements of more than one order of magnitude in terms of equal error rate over two baseline systems provided by the ASVspoof 2019 organizers, and significant improvements over the benchmark systems we evaluated.
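The pooling step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the exact form of the self-attention layer, so this example makes simplifying assumptions: a single hypothetical vector `attn_vector` scores each frame by a dot product, a softmax over time turns the scores into weights, and the weighted mean and standard deviation are concatenated. The function name and all inputs are illustrative and are not taken from the paper.

```python
import math


def attentive_stats_pooling(frames, attn_vector):
    """Pool a variable-length sequence of frame vectors into one fixed-size vector.

    frames: list of T frame-level vectors, each of length D (e.g. CNN outputs).
    attn_vector: hypothetical learned vector of length D used to score frames
                 (an assumption; the paper's attention layer may differ).
    Returns a list of length 2 * D: weighted mean followed by weighted std.
    """
    # One scalar score per frame; a dot product stands in for the attention layer.
    scores = [sum(w * x for w, x in zip(attn_vector, f)) for f in frames]

    # Softmax over time, shifted by the maximum for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]

    dim = len(frames[0])
    # First-order statistic: attention-weighted mean per dimension.
    mean = [sum(a * f[d] for a, f in zip(alphas, frames)) for d in range(dim)]
    # Second-order statistic: attention-weighted standard deviation per dimension.
    var = [
        sum(a * (f[d] - mean[d]) ** 2 for a, f in zip(alphas, frames))
        for d in range(dim)
    ]
    std = [math.sqrt(v) for v in var]
    return mean + std


# Two toy "utterances" of different lengths, each with D = 2 features per frame.
short_utt = [[1.0, 0.0], [3.0, 2.0]]
long_utt = [[1.0, 0.0], [3.0, 2.0], [2.0, 1.0], [0.0, 4.0], [5.0, 1.0]]
uniform = [0.0, 0.0]  # zero scores give uniform attention weights

pooled_short = attentive_stats_pooling(short_utt, uniform)
pooled_long = attentive_stats_pooling(long_utt, uniform)
```

Both `pooled_short` and `pooled_long` have length 4 (twice the feature dimension) even though the inputs have 2 and 5 frames, which is what allows a fixed set of fully connected layers to score utterances of any duration.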
Keywords
Voice biometrics,Presentation attacks detection,Speaker verification,Convolutional neural networks.