Academic Paper

GLFF: Global and Local Feature Fusion for AI-Synthesized Image Detection
Document Type
Periodical
Source
IEEE Transactions on Multimedia (IEEE Trans. Multimedia), vol. 26, pp. 4073-4085, 2024
Subject
Components, Circuits, Devices and Systems
Communication, Networking and Broadcast Technologies
Computing and Processing
General Topics for Engineers
Feature extraction
Faces
Task analysis
Image synthesis
Semantics
Generative adversarial networks
Fuses
AI-synthesized Image Detection
Synthesized Face Image Dataset
Image Forensics
Feature Fusion
Attention Mechanism
Language
ISSN
1520-9210
1941-0077
Abstract
With the rapid development of deep generative models (such as Generative Adversarial Networks and diffusion models), AI-synthesized images are now of such high quality that humans can hardly distinguish them from pristine ones. Although existing detection methods perform well in specific evaluation settings, e.g., on images from seen models or on images without real-world post-processing, they tend to suffer serious performance degradation in real-world scenarios, where test images may be generated by more powerful generation models or altered by various post-processing operations. To address this issue, we propose a Global and Local Feature Fusion (GLFF) framework that learns rich and discriminative representations for AI-synthesized image detection by combining multi-scale global features from the whole image with refined local features from informative patches. GLFF fuses information from two branches: a global branch that extracts multi-scale semantic features and a local branch that selects informative patches for detailed local artifact extraction. Because no existing synthesized image dataset simulates real-world applications for evaluation, we further create a challenging fake image dataset, named DeepFakeFaceForensics ($DF^{3}$), which covers 6 state-of-the-art generation models and a variety of post-processing techniques to approximate real-world scenarios. Experimental results demonstrate the superiority of our method over state-of-the-art methods on the proposed $DF^{3}$ dataset and three other open-source datasets.
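The two-branch idea in the abstract can be illustrated with a toy sketch. This is not the authors' GLFF implementation: the abstract does not specify the networks, the patch-selection rule, or the fusion operator. The sketch assumes a square grayscale image stored as a list of lists, uses grid-averaged intensities as a stand-in for multi-scale global features, uses patch variance as a stand-in for a learned "informative patch" score, and uses plain concatenation as a stand-in for the fusion step. All function names (`split_patches`, `global_features`, `local_features`, `fuse`) are hypothetical.

```python
from statistics import mean, pvariance


def split_patches(img, size):
    """Split a 2D list into non-overlapping size x size patches (row-major, flattened)."""
    h, w = len(img), len(img[0])
    patches = []
    for r in range(0, h - size + 1, size):
        for c in range(0, w - size + 1, size):
            patches.append([img[r + i][c + j] for i in range(size) for j in range(size)])
    return patches


def global_features(img, scales=(1, 2)):
    """Toy multi-scale global descriptor: mean intensity of each cell in an s x s grid.

    Assumption: the image is square and its side is divisible by every scale.
    """
    side = len(img)
    feats = []
    for s in scales:
        for cell in split_patches(img, side // s):
            feats.append(mean(cell))
    return feats


def local_features(img, patch_size=2, top_k=2):
    """Toy local branch: keep the top_k highest-variance patches.

    Assumption: variance is only a placeholder for a learned patch-selection score.
    Each selected patch contributes [mean, variance].
    """
    patches = split_patches(img, patch_size)
    selected = sorted(patches, key=pvariance, reverse=True)[:top_k]
    feats = []
    for p in selected:
        feats.extend([mean(p), pvariance(p)])
    return feats


def fuse(img, patch_size=2, top_k=2):
    """Toy fusion: concatenate global and local descriptors (placeholder for learned fusion)."""
    return global_features(img) + local_features(img, patch_size, top_k)


if __name__ == "__main__":
    img = [
        [0, 0, 1, 1],
        [0, 0, 1, 1],
        [0, 9, 5, 5],
        [0, 0, 5, 5],
    ]
    # The bottom-left patch holds the only local irregularity, so it is selected.
    print(fuse(img, patch_size=2, top_k=1))
```

In a real detector the fused vector would feed a classifier that outputs real versus synthesized; here it only shows how whole-image context and a few selected patches can end up in one representation.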