CV, IR | Jan 19, 2018

Image Provenance Analysis at Scale

arXiv:1801.06510v2 | 71 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for fact checking and authorship verification in the face of media manipulation, such as fake news, by providing a scalable approach to image provenance analysis.

The paper tackles image provenance analysis at scale with an end-to-end pipeline that, given a query image, retrieves the original images whose content appears in it and recovers the sequences of transformations that produced it. Each stage is evaluated on previously published datasets against state-of-the-art results, and a new dataset of real-world cases from Reddit is introduced with baseline results.

Prior art has shown it is possible to estimate, through image processing and computer vision techniques, the types and parameters of transformations that have been applied to the content of individual images to obtain new images. Given a large corpus of images and a query image, an interesting further step is to retrieve the set of original images whose content is present in the query image, as well as the detailed sequences of transformations that yield the query image given the original images. This is a problem that recently has received the name of image provenance analysis. In these times of public media manipulation (e.g., fake news and meme sharing), obtaining the history of image transformations is relevant for fact checking and authorship verification, among many other applications. This article presents an end-to-end processing pipeline for image provenance analysis, which works at real-world scale. It employs a cutting-edge image filtering solution that is custom-tailored for the problem at hand, as well as novel techniques for obtaining the provenance graph that expresses how the images, as nodes, are ancestrally connected. A comprehensive set of experiments for each stage of the pipeline is provided, comparing the proposed solution with state-of-the-art results, employing previously published datasets. In addition, this work introduces a new dataset of real-world provenance cases from the social media site Reddit, along with baseline results.
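To make the graph-construction stage more concrete, the following is a minimal sketch and not the paper's method. The abstract does not say how the provenance graph is built, so the sketch assumes the filtering stage has already returned a small set of related images and that a symmetric pairwise dissimilarity matrix is available. It then links the images with a minimum spanning tree (Kruskal's algorithm), a simple stand-in for an undirected provenance graph. The function name `provenance_spanning_tree` and the toy matrix are hypothetical.

```python
from typing import List, Tuple


def provenance_spanning_tree(dissimilarity: List[List[float]]) -> List[Tuple[int, int]]:
    """Return the edges of a minimum spanning tree over the images.

    Assumption: dissimilarity[i][j] is a symmetric, non-negative score where
    lower values mean images i and j are more likely directly related.
    """
    n = len(dissimilarity)
    parent = list(range(n))  # union-find forest, one set per image

    def find(x: int) -> int:
        # Follow parent links to the set representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Consider every unordered image pair, most similar pairs first.
    candidates = sorted(
        (dissimilarity[i][j], i, j)
        for i in range(n)
        for j in range(i + 1, n)
    )

    edges: List[Tuple[int, int]] = []
    for _, i, j in candidates:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # keep the edge only if it joins two components
            parent[root_i] = root_j
            edges.append((i, j))
    return edges


# Toy example (made-up scores): four images forming a chain 0 - 1 - 2 - 3.
toy_dissimilarity = [
    [0, 1, 4, 9],
    [1, 0, 2, 6],
    [4, 2, 0, 3],
    [9, 6, 3, 0],
]
toy_edges = provenance_spanning_tree(toy_dissimilarity)
print(toy_edges)
```

A spanning tree yields only undirected links. Deciding which image is the ancestor in each pair, and allowing nodes with several parents (as in composites), would need additional asymmetric evidence that this sketch does not model.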

Code Implementations: 1 repo