CV · Nov 20, 2022

Auto-Focus Contrastive Learning for Image Manipulation Detection

arXiv:2211.10922v1 · 1 citation · h-index: 34
Originality: Incremental advance
AI Analysis

This work addresses the need for more accurate detection of manipulated images, which is crucial for applications like forensics and media verification. It appears incremental, however, because it builds on existing contrastive learning methods.

The paper tackles the problem of sub-optimal image manipulation detection by proposing an Auto-Focus Contrastive Learning network that automatically focuses on manipulated regions and explores trace relations, achieving performance improvements of up to 2.5%, 7.5%, and 0.8% F1 score on three datasets.

Generally, current image manipulation detection models are simply built on manipulation traces. However, we argue that those models achieve sub-optimal detection performance as they tend to: 1) distinguish the manipulation traces from a lot of noisy information within the entire image, and 2) ignore the trace relations among the pixels of each manipulated region and its surroundings. To overcome these limitations, we propose an Auto-Focus Contrastive Learning (AF-CL) network for image manipulation detection. It contains two main ideas, i.e., multi-scale view generation (MSVG) and trace relation modeling (TRM). Specifically, MSVG aims to generate a pair of views, each of which contains the manipulated region and its surroundings at a different scale, while TRM models the trace relations among the pixels of each manipulated region and its surroundings to learn a discriminative representation. After the AF-CL network is trained by minimizing the distance between the representations of corresponding views, it is able to automatically focus on the manipulated region and its surroundings and sufficiently explore their trace relations for accurate manipulation detection. Extensive experiments demonstrate that, compared to state-of-the-art methods, AF-CL provides significant performance improvements, i.e., up to 2.5%, 7.5%, and 0.8% F1 score, on the CASIA, NIST, and Coverage datasets, respectively.
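
To make the multi-scale view idea more concrete, here is a minimal NumPy sketch, not the authors' implementation. It assumes a ground-truth binary mask of the manipulated region is available, crops that region plus its surroundings at two different scales, and uses cosine distance as a stand-in for the (unspecified) distance between view representations. The names `bounding_box`, `crop_view`, `cosine_distance`, and the `scale` parameter are hypothetical and chosen only for illustration; the encoder and the TRM module are omitted.

```python
import numpy as np

def bounding_box(mask):
    """Return (top, left, bottom, right) of the True region; bottom/right are exclusive."""
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    return int(rows[0]), int(cols[0]), int(rows[-1]) + 1, int(cols[-1]) + 1

def crop_view(image, mask, scale):
    """Crop the manipulated region plus surroundings (assumed scheme).

    scale=1.0 keeps the tight bounding box; larger values enlarge the box
    symmetrically, clipped to the image borders.
    """
    top, left, bottom, right = bounding_box(mask)
    pad_h = int(round((bottom - top) * (scale - 1) / 2))
    pad_w = int(round((right - left) * (scale - 1) / 2))
    t = max(top - pad_h, 0)
    l = max(left - pad_w, 0)
    b = min(bottom + pad_h, image.shape[0])
    r = min(right + pad_w, image.shape[1])
    return image[t:b, l:r]

def cosine_distance(a, b):
    """1 - cosine similarity; an assumed choice of distance between representations."""
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    return 1.0 - float(a @ b)

# Toy image and a rectangular "manipulated" region of 12 x 16 pixels.
image = np.arange(64 * 64, dtype=np.float32).reshape(64, 64)
mask = np.zeros((64, 64), dtype=bool)
mask[20:32, 24:40] = True

# Two views of the same region and its surroundings at different scales.
view_small = crop_view(image, mask, scale=1.5)
view_large = crop_view(image, mask, scale=2.0)
```

In a training loop, each view would be resized to a common resolution, passed through a shared encoder, and the distance between the two resulting representations would be minimized; those steps are left out here because the abstract does not specify them.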

