CV · Nov 14, 2017

Capturing Localized Image Artifacts through a CNN-based Hyper-image Representation

arXiv:1711.04945v1
Originality: Incremental advance
AI Analysis

This work addresses the detection of localized image artifacts, a computer vision problem with applications such as image quality assessment and tampering detection, and offers an incremental improvement over existing patch-based methods.

The paper tackles the challenge of capturing localized image artifacts with deep CNNs on small datasets. It introduces a two-stage CNN that uses hyper-image representations to model spatial correlations among patches, and it reports performance improvements over strong baselines in no-reference image quality estimation and image tampering detection.

Training deep CNNs to capture localized image artifacts on a relatively small dataset is a challenging task. With enough images at hand, one can hope that a deep CNN characterizes localized artifacts over the entire data and their effect on the output. However, on smaller datasets, such deep CNNs may overfit, and shallow ones find it hard to capture local artifacts. Thus some image-based small-data applications first train their framework on a collection of patches (instead of the entire image) to better learn the representation of localized artifacts. The output is then obtained by averaging the patch-level results. Such an approach ignores the spatial correlation among patches and how various patch locations affect the output. It also fails in cases where only a few patches mainly contribute to the image label. To address these scenarios, we develop the notion of hyper-image representations. Our CNN has two stages. The first stage is trained on patches. The last-layer representation developed in the first stage is then used to form a hyper-image, on which the second stage is trained. We show that this approach is able to develop a better mapping between the image and its output. We analyze additional properties of our approach and show its effectiveness on one synthetic and two real-world vision tasks - no-reference image quality estimation and image tampering detection - by its performance improvement over existing strong baselines.
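The difference between patch averaging and a hyper-image can be illustrated with a small sketch. This is not the authors' implementation: `stage_one_features` and the three hand-picked statistics are hypothetical stand-ins for the last-layer representation of a trained stage-one CNN, the patch size and toy image are assumptions, and the second-stage CNN itself is omitted. The sketch only shows how per-patch feature vectors might be arranged on the patch grid so that spatial layout is preserved.

```python
import numpy as np


def split_into_patches(image, patch):
    """Split an (H, W) image into a (rows, cols, patch, patch) grid
    of non-overlapping patches, cropping any remainder."""
    h, w = image.shape
    rows, cols = h // patch, w // patch
    cropped = image[:rows * patch, :cols * patch]
    return cropped.reshape(rows, patch, cols, patch).swapaxes(1, 2)


def stage_one_features(patch_pixels):
    """Hypothetical stand-in for the stage-one CNN's last-layer
    representation: three simple statistics of the patch."""
    return np.array([patch_pixels.mean(), patch_pixels.std(), patch_pixels.max()])


def build_hyper_image(image, patch):
    """Place each patch's feature vector at that patch's grid position,
    giving an array of shape (rows, cols, D)."""
    grid = split_into_patches(image, patch)
    rows, cols = grid.shape[:2]
    return np.stack([
        np.stack([stage_one_features(grid[r, c]) for c in range(cols)])
        for r in range(rows)
    ])


# Toy 8x8 image with one localized "artifact" in the top-right 4x4 patch.
image = np.zeros((8, 8))
image[0:4, 4:8] = 1.0

hyper = build_hyper_image(image, patch=4)

# Patch-averaging baseline: treat the max statistic as a patch-level score
# (an assumption for illustration) and average it over all patches.
patch_scores = hyper[..., 2]
averaged = patch_scores.mean()

# The hyper-image keeps the location of the strongest patch response.
location = tuple(int(i) for i in np.unravel_index(patch_scores.argmax(), patch_scores.shape))

print(hyper.shape, averaged, location)
```

In this toy case, averaging reduces a single strongly affected patch to a quarter of its score, while the hyper-image still records which grid cell produced it. A second-stage model trained on `hyper` would have access to that layout, which is the property the abstract attributes to the hyper-image representation.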

