Title: |
GIM: A Million-scale Benchmark for Generative Image Manipulation Detection and Localization |
Authors: |
Chen, Yirui; Huang, Xudong; Zhang, Quan; Li, Wei; Zhu, Mingjian; Yan, Qiangyu; Li, Simiao; Chen, Hanting; Hu, Hailin; Yang, Jie; Liu, Wei; Hu, Jie
Publication Year: |
2024 |
Collection: |
Computer Science |
Subject Terms: |
Computer Science - Computer Vision and Pattern Recognition |
More Details: |
The extraordinary ability of generative models to edit images and synthesize realistic content poses a serious threat to the trustworthiness of multimedia data and has driven research on image manipulation detection and localization (IMDL). However, the lack of a large-scale data foundation has left the IMDL task out of reach. In this paper, we build a local manipulation data generation pipeline that integrates the powerful capabilities of SAM, an LLM, and generative models. On this basis, we propose the GIM dataset, which has the following advantages: 1) Large scale: GIM includes over one million pairs of AI-manipulated and real images. 2) Rich image content: GIM covers a broad range of image classes. 3) Diverse generative manipulation: the images are manipulated with state-of-the-art generators across a variety of manipulation tasks. These advantages allow for a more comprehensive evaluation of IMDL methods and extend their applicability to diverse images. We introduce the GIM benchmark with two settings to evaluate existing IMDL methods. In addition, we propose a novel IMDL framework, termed GIMFormer, which consists of a ShadowTracer, a Frequency-Spatial Block (FSB), and a Multi-Window Anomalous Modeling (MWAM) module. Extensive experiments on GIM demonstrate that GIMFormer surpasses the previous state-of-the-art approaches on both benchmark settings. Comment: Code page: https://github.com/chenyirui/GIM
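The abstract describes the data pipeline only at a high level (SAM for region selection, an LLM for edit instructions, a generative model for the manipulation). The following is a minimal illustrative sketch of such a pipeline, not the authors' released code (see the GitHub page above for that); it assumes the segment-anything and diffusers libraries, and generate_edit_prompt is a hypothetical stand-in for the LLM step.

# Illustrative sketch of a local-manipulation data pipeline: SAM segments a
# region around a point prompt, an LLM (mocked here) proposes an edit prompt,
# and a diffusion inpainting model repaints the masked region. The binary
# mask can serve as the pixel-level localization label for the forged image.
import numpy as np
import torch
from PIL import Image
from segment_anything import SamPredictor, sam_model_registry
from diffusers import StableDiffusionInpaintPipeline

def generate_edit_prompt(object_name: str) -> str:
    # Hypothetical placeholder for the LLM that proposes an edit instruction.
    return f"a photorealistic {object_name}, high quality"

def manipulate(image_path: str, point_xy, object_name: str) -> Image.Image:
    image = np.array(Image.open(image_path).convert("RGB"))

    # 1) Segment the target object around a point prompt with SAM.
    sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b.pth")
    predictor = SamPredictor(sam)
    predictor.set_image(image)
    masks, scores, _ = predictor.predict(
        point_coords=np.array([point_xy]), point_labels=np.array([1])
    )
    mask = masks[int(scores.argmax())]  # best-scoring binary mask

    # 2) Inpaint the masked region with a text-guided diffusion model.
    pipe = StableDiffusionInpaintPipeline.from_pretrained(
        "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
    ).to("cuda")
    edited = pipe(
        prompt=generate_edit_prompt(object_name),
        image=Image.fromarray(image).resize((512, 512)),
        mask_image=Image.fromarray((mask * 255).astype(np.uint8)).resize((512, 512)),
    ).images[0]
    return edited  # manipulated image paired with the original and its mask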
Document Type: |
Working Paper |
Access URL: |
http://arxiv.org/abs/2406.16531 |
Accession Number: |
edsarx.2406.16531 |
Database: |
arXiv |