Detection of Deepfake Environmental Audio
arXiv (2024)
Abstract
With the ever-rising quality of deep generative models, it is increasingly
important to be able to discern whether the audio data at hand have been
recorded or synthesized. Although the detection of fake speech signals has been
studied extensively, this is not the case for the detection of fake
environmental audio.
We propose a simple and efficient pipeline for detecting fake environmental
sounds based on the CLAP audio embedding. We evaluate this detector using audio
data from the 2023 DCASE challenge task on Foley sound synthesis.
Our experiments show that fake sounds generated by 44 state-of-the-art
synthesizers can be detected on average with 98% [...]
an audio embedding learned on environmental audio is beneficial over a standard
VGGish one as it provides a 10% [...]
listening to Incorrect Negative examples demonstrates audible features of fake
sounds missed by the detector such as distortion and implausible background
noise.
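
The detection pipeline described above (a fixed audio embedding followed by a real-versus-fake decision) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not state which classifier sits on top of the CLAP embedding, so plain logistic regression is swapped in here, and the embeddings are random stand-in vectors rather than actual CLAP outputs. The function names (`make_standin_embeddings`, `train_detector`, `predict_fake`, `accuracy`) are hypothetical.

```python
import math
import random


def make_standin_embeddings(n, dim, seed):
    """Return (vectors, labels) of random stand-in data, NOT real CLAP embeddings.

    Label 1 means "fake", label 0 means "recorded". The two classes are
    Gaussian clusters centered at +1 and -1 in every dimension.
    """
    rng = random.Random(seed)
    vectors, labels = [], []
    for i in range(n):
        label = i % 2
        mean = 1.0 if label == 1 else -1.0
        vectors.append([rng.gauss(mean, 1.0) for _ in range(dim)])
        labels.append(label)
    return vectors, labels


def _sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def _score(w, b, x):
    """Linear score of one embedding vector."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_detector(vectors, labels, epochs=50, lr=0.1):
    """Fit logistic regression with plain stochastic gradient descent.

    Assumption: the classifier choice is illustrative, not the paper's.
    """
    w = [0.0] * len(vectors[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(vectors, labels):
            grad = _sigmoid(_score(w, b, x)) - y  # d(loss)/d(score)
            w = [wi - lr * grad * xi for wi, xi in zip(w, x)]
            b -= lr * grad
    return w, b


def predict_fake(w, b, x):
    """Return True when the detector labels the embedding as fake."""
    return _score(w, b, x) > 0.0


def accuracy(w, b, vectors, labels):
    """Fraction of embeddings whose predicted label matches the true one."""
    hits = sum(int(predict_fake(w, b, x)) == y for x, y in zip(vectors, labels))
    return hits / len(labels)


if __name__ == "__main__":
    train_x, train_y = make_standin_embeddings(200, 8, seed=0)
    test_x, test_y = make_standin_embeddings(100, 8, seed=1)
    w, b = train_detector(train_x, train_y)
    print("held-out accuracy on stand-in data:", accuracy(w, b, test_x, test_y))
```

In an actual experiment the stand-in vectors would be replaced by embeddings computed from recorded and synthesized audio clips, and the accuracy printed here says nothing about the figures reported in the abstract.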