Detection of Deepfake Environmental Audio

Hafsa Ouajdi, Oussama Hadder, Modan Tailleur, Mathieu Lagrange, Laurie M. Heller

arXiv (2024)

Abstract
With the ever-rising quality of deep generative models, it is increasingly important to be able to discern whether the audio data at hand have been recorded or synthesized. Although the detection of fake speech signals has been studied extensively, this is not the case for the detection of fake environmental audio. We propose a simple and efficient pipeline for detecting fake environmental sounds based on the CLAP audio embedding. We evaluate this detector using audio data from the 2023 DCASE challenge task on Foley sound synthesis. Our experiments show that fake sounds generated by 44 state-of-the-art synthesizers can be detected on average with 98% accuracy. We show that using an audio embedding learned on environmental audio is beneficial over a standard VGGish one, as it provides a 10% absolute increase in performance. Informal listening to Incorrect Negative examples demonstrates audible features of fake sounds missed by the detector, such as distortion and implausible background noise.
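
The abstract describes the detector only as a pipeline built on a fixed audio embedding, so the following is a minimal sketch of that general idea rather than the authors' implementation. It assumes each clip has already been turned into an embedding vector and trains a plain logistic-regression classifier on top. The classifier choice, the embedding dimension of 16, and the synthetic Gaussian "embeddings" produced by the hypothetical helper `make_toy_embeddings` are all illustrative assumptions; no CLAP model or DCASE data is involved.

```python
import numpy as np


def train_logistic_detector(embeddings, labels, lr=0.1, epochs=500):
    """Fit logistic regression by gradient descent (label 1 = fake, 0 = recorded)."""
    n_samples, dim = embeddings.shape
    weights = np.zeros(dim)
    bias = 0.0
    for _ in range(epochs):
        logits = embeddings @ weights + bias
        probs = 1.0 / (1.0 + np.exp(-logits))
        error = probs - labels  # gradient of the cross-entropy loss w.r.t. logits
        weights -= lr * (embeddings.T @ error) / n_samples
        bias -= lr * error.mean()
    return weights, bias


def predict_fake(embeddings, weights, bias, threshold=0.5):
    """Return 1 for clips the detector labels as fake, 0 otherwise."""
    probs = 1.0 / (1.0 + np.exp(-(embeddings @ weights + bias)))
    return (probs >= threshold).astype(int)


def make_toy_embeddings(n_per_class, dim, rng):
    """Hypothetical stand-in for real embeddings: two shifted Gaussian clouds."""
    recorded = rng.normal(0.0, 1.0, size=(n_per_class, dim))
    fake = rng.normal(1.0, 1.0, size=(n_per_class, dim))
    features = np.vstack([recorded, fake])
    labels = np.concatenate([np.zeros(n_per_class), np.ones(n_per_class)])
    return features, labels


rng = np.random.default_rng(0)
X_train, y_train = make_toy_embeddings(200, 16, rng)
X_test, y_test = make_toy_embeddings(100, 16, rng)

weights, bias = train_logistic_detector(X_train, y_train)
accuracy = float((predict_fake(X_test, weights, bias) == y_test).mean())
print(f"toy held-out accuracy: {accuracy:.3f}")
```

The accuracy printed here reflects only how separable the toy data is and says nothing about the results reported in the abstract. In a real setting, the rows of `X_train` would be embeddings extracted from recorded and synthesized clips, and the embedding model would stay frozen while only the small classifier is trained.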