CV · Sep 1, 2020

SPAN: Spatial Pyramid Attention Network for Image Manipulation Localization

arXiv:2009.00726v2 · 295 citations
Originality: Highly original
AI Analysis

This work addresses image manipulation localization, a task that matters for applications such as media forensics and security, by proposing a method that improves localization accuracy.

The paper tackles the detection and localization of multiple types of image manipulation. It introduces the Spatial Pyramid Attention Network (SPAN), which uses a pyramid of local self-attention blocks and a novel position projection to model relationships between image patches at multiple scales. The authors report significant performance gains over previous state-of-the-art methods on standard datasets.

We present a novel framework, Spatial Pyramid Attention Network (SPAN), for detection and localization of multiple types of image manipulations. The proposed architecture efficiently and effectively models the relationship between image patches at multiple scales by constructing a pyramid of local self-attention blocks. The design includes a novel position projection to encode the spatial positions of the patches. SPAN is trained on a generic, synthetic dataset but can also be fine-tuned for specific datasets. The proposed method shows significant gains in performance on standard datasets over previous state-of-the-art methods.
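The idea of a pyramid of local self-attention blocks can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a 3x3 neighbourhood per block, hypothetical dilation values (1, 3, 9), and no learned weights, and it leaves out the position projection entirely, since the abstract does not specify its form. The names `local_self_attention` and `pyramid` are made up for this illustration.

```python
import math


def local_self_attention(feat, dilation):
    """Attend from each cell to its 3x3 neighbourhood sampled at `dilation`.

    `feat` is an H x W grid (nested lists) of equal-length feature vectors.
    Assumption: plain dot-product attention with no learned projections.
    """
    h, w, dim = len(feat), len(feat[0]), len(feat[0][0])
    out = [[None] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            query = feat[y][x]
            # Gather in-bounds neighbours; the centre cell is always included.
            neighbours = []
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy * dilation, x + dx * dilation
                    if 0 <= ny < h and 0 <= nx < w:
                        neighbours.append(feat[ny][nx])
            # Scaled dot-product scores, then a numerically stable softmax.
            scores = [
                sum(q * k for q, k in zip(query, n)) / math.sqrt(dim)
                for n in neighbours
            ]
            top = max(scores)
            weights = [math.exp(s - top) for s in scores]
            total = sum(weights)
            # Output is the attention-weighted average of neighbour features.
            out[y][x] = [
                sum(wt * n[c] for wt, n in zip(weights, neighbours)) / total
                for c in range(dim)
            ]
    return out


def pyramid(feat, dilations=(1, 3, 9)):
    """Stack local attention blocks; larger dilations reach farther patches."""
    for d in dilations:
        feat = local_self_attention(feat, d)
    return feat
```

Each block only looks at a handful of neighbours, so the cost per cell stays constant, while the growing dilation lets later blocks relate patches that are farther apart. On a constant feature map the output equals the input, because every cell averages identical vectors.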

