Delving into the Local: Dynamic Inconsistency Learning for DeepFake Video Detection

AAAI Conference on Artificial Intelligence (2022)

Cited by 46 | Views 112
Abstract
The rapid development of facial manipulation techniques has aroused public concern in recent years. Existing deepfake video detection approaches attempt to capture the discriminative features between real and fake faces based on temporal modelling. However, these works impose supervision on sparsely sampled video frames but overlook the local motions among adjacent frames, which instead encode rich inconsistency information that can serve as an efficient indicator for DeepFake video detection. To mitigate this issue, we delve into the local motion and propose a novel sampling unit named snippet, which contains a few successive video frames, for local temporal inconsistency learning. Moreover, we elaborately design an Intra-Snippet Inconsistency Module (Intra-SIM) and an Inter-Snippet Interaction Module (Inter-SIM) to establish a dynamic inconsistency modelling framework. Specifically, the Intra-SIM applies bi-directional temporal difference operations and a learnable convolution kernel to mine the short-term motions within each snippet. The Inter-SIM is then devised to promote cross-snippet information interaction to form global representations. The Intra-SIM and Inter-SIM work in an alternate manner and can be plugged into existing 2D CNNs. Our method outperforms state-of-the-art competitors on four popular benchmark datasets, i.e., FaceForensics++, Celeb-DF, DFDC and WildDeepfake. Besides, extensive experiments and visualizations are also presented to further illustrate its effectiveness.
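To make the snippet-level idea concrete, below is a minimal sketch of the bi-directional temporal difference operation that the abstract attributes to Intra-SIM. This is not the authors' implementation: the function name, tensor shapes (T, C, H, W), and zero-padding choice are hypothetical placeholders, and the paper's actual module additionally involves a learnable convolution kernel, which is omitted here.

```python
# Minimal sketch (not the authors' code) of the bi-directional
# temporal-difference idea behind Intra-SIM on one snippet.
# All names and shapes are assumed placeholders.
import torch


def bidirectional_temporal_difference(snippet: torch.Tensor) -> torch.Tensor:
    """snippet: feature maps of shape (T, C, H, W) for T consecutive frames.

    Returns forward and backward frame-to-frame differences stacked along
    the channel axis, exposing local motion cues among adjacent frames.
    """
    forward_diff = snippet[1:] - snippet[:-1]    # f_{t+1} - f_t
    backward_diff = snippet[:-1] - snippet[1:]   # f_t - f_{t+1}
    # Zero-pad so both difference tensors keep the original length T.
    pad = torch.zeros_like(snippet[:1])
    forward_diff = torch.cat([forward_diff, pad], dim=0)
    backward_diff = torch.cat([pad, backward_diff], dim=0)
    return torch.cat([forward_diff, backward_diff], dim=1)  # (T, 2C, H, W)


# Example: a snippet of 4 frames with 64-channel feature maps.
feats = torch.randn(4, 64, 56, 56)
motion_cues = bidirectional_temporal_difference(feats)
print(motion_cues.shape)  # torch.Size([4, 128, 56, 56])
```

In the full framework described above, such intra-snippet motion cues would be refined per snippet and then exchanged across snippets by Inter-SIM, with the two modules alternating inside a 2D CNN backbone.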
Keywords
Computer Vision (CV)