Learning Spatiotemporal Inconsistency Via Thumbnail Layout for Face Deepfake Detection

Yuting Xu, Jian Liang, Lijun Sheng, Xiao-Yu Zhang

International Journal of Computer Vision (2024)

Chinese Academy of Sciences

Abstract

Deepfake threats to society and cybersecurity have provoked significant public apprehension, driving intensified efforts in deepfake video detection. Current video-level methods are mostly based on 3D CNNs, which achieve good performance but incur high computational demands. This paper introduces an elegantly simple yet effective strategy named Thumbnail Layout (TALL), which transforms a video clip into a pre-defined layout to preserve spatial and temporal dependencies. This transformation process involves sequentially masking frames at the same positions within each frame. These frames are then resized into sub-frames and reorganized into the predetermined layout, forming thumbnails. TALL is model-agnostic and remarkably simple, requiring only minimal code modifications. Furthermore, we introduce a graph reasoning block (GRB) and a semantic consistency (SC) loss to strengthen TALL, culminating in TALL++. GRB enhances interactions between different semantic regions to capture semantic-level inconsistency clues. The semantic consistency loss imposes consistency constraints on semantic features to improve the model's generalization ability. Extensive experiments on intra-dataset detection, cross-dataset detection, diffusion-generated image detection, and deepfake generation method recognition show that TALL++ achieves results surpassing or comparable to state-of-the-art methods, demonstrating the effectiveness of our approaches for various deepfake detection problems. The code is available at https://github.com/rainy-xu/TALL4Deepfake.
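The thumbnail transformation described in the abstract (mask, resize, tile) can be illustrated with a small sketch. This is not the authors' implementation: the function names `mask_block`, `downsample`, and `thumbnail_layout` are hypothetical, frames are plain grayscale 2D lists, the resize is a simple nearest-neighbour subsampling, and the mask is assumed to be one square block zeroed at the same position in every frame. The layout size, mask size, and resize method used in the paper are not stated here, so the values below are placeholders.

```python
def mask_block(frame, top, left, size):
    """Return a copy of `frame` with a size x size square set to 0.
    Assumption: the mask is a single square at a fixed position."""
    out = [row[:] for row in frame]
    for r in range(top, min(top + size, len(out))):
        for c in range(left, min(left + size, len(out[r]))):
            out[r][c] = 0
    return out


def downsample(frame, factor):
    """Nearest-neighbour resize: keep every `factor`-th row and column."""
    return [row[::factor] for row in frame[::factor]]


def thumbnail_layout(frames, rows, cols, factor, mask=None):
    """Tile rows*cols frames into one image, row-major in time order.

    `mask` is an optional (top, left, size) tuple applied identically
    to every frame before resizing (an assumed reading of the abstract).
    """
    if len(frames) != rows * cols:
        raise ValueError("need exactly rows * cols frames")
    subs = []
    for frame in frames:
        if mask is not None:
            frame = mask_block(frame, *mask)
        subs.append(downsample(frame, factor))

    thumbnail = []
    for r in range(rows):
        row_frames = subs[r * cols:(r + 1) * cols]
        # Concatenate the y-th pixel row of each sub-frame in this grid row.
        for y in range(len(row_frames[0])):
            line = []
            for sub in row_frames:
                line.extend(sub[y])
            thumbnail.append(line)
    return thumbnail


# Four 4x4 frames, each filled with its (1-based) time index.
frames = [[[i + 1] * 4 for _ in range(4)] for i in range(4)]
thumb = thumbnail_layout(frames, rows=2, cols=2, factor=2, mask=(0, 0, 2))
for line in thumb:
    print(line)
```

Because the result is an ordinary 2D image, it could in principle be passed to any image backbone, which is consistent with the abstract's claim that the approach is model-agnostic; temporal order is encoded only by each sub-frame's position in the grid.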

Key words

Forgery detection, Thumbnail, Spatiotemporal inconsistency, Graph reasoning, Vision transformer
