Reliability Estimation for Synthetic Speech Detection

ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023

Abstract
Recent advances in speech synthesis and counterfeit audio generation have pushed the multimedia forensics community to develop speech deepfake detection techniques that counter the resulting threats. Although synthetic speech detectors perform excellently in controlled conditions, they are not always reliable in open-set cases, that is, when evaluated on data that differ substantially from those seen during training. This can lead to misleading scores and uninformative results in real-world scenarios. In this paper, we propose a method for estimating the reliability of a prediction made by a speech deepfake detector. This enables us to perform detection only on the most relevant portions of a signal, i.e., the time windows on which we obtain the most reliable scores, which increases the final accuracy of the developed systems. Because some audio fragments may not contain enough traces for the task at hand and can negatively affect the system output, a reliability estimator allows us to discard them and focus only on the most pertinent data. The proposed method improves the performance of the considered detector and shows excellent generalization capabilities on unseen datasets.
Keywords
Audio Forensics, Speech, Deepfake, Reliability
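
The window-selection idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `aggregate_reliable_scores`, the 0.5 threshold, the averaging rule, and the fallback to all windows are assumptions made here for illustration. The sketch takes per-window detector scores and per-window reliability estimates as given and does not show how either would be computed.

```python
from statistics import fmean


def aggregate_reliable_scores(scores, reliabilities, threshold=0.5):
    """Average detector scores over the windows judged reliable.

    scores:        per-window detector outputs (hypothetical, e.g. in [0, 1])
    reliabilities: per-window reliability estimates (hypothetical, in [0, 1])
    threshold:     assumed cut-off below which a window is discarded

    Returns (final_score, number_of_windows_used).
    """
    if len(scores) != len(reliabilities):
        raise ValueError("scores and reliabilities must have the same length")
    if not scores:
        raise ValueError("at least one window is required")

    # Keep only the windows whose reliability reaches the threshold.
    kept = [s for s, r in zip(scores, reliabilities) if r >= threshold]

    # Assumed fallback: if every window is discarded, use all of them
    # rather than returning no decision at all.
    if not kept:
        kept = list(scores)

    return fmean(kept), len(kept)


if __name__ == "__main__":
    window_scores = [0.9, 0.2, 0.8, 0.4]
    window_reliabilities = [0.95, 0.1, 0.7, 0.3]
    final_score, used = aggregate_reliable_scores(window_scores, window_reliabilities)
    print(f"final score {final_score:.2f} from {used} of {len(window_scores)} windows")
```

In this toy input, the two low-reliability windows are dropped and the track-level score is the mean of the remaining two. A real system could just as well weight windows by reliability instead of applying a hard threshold; the hard cut-off is used here only because it mirrors the "discard" wording of the abstract.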