CV · AI · GR · LG · Apr 25, 2016

Context Encoders: Feature Learning by Inpainting

arXiv:1604.07379v2 · 5732 citations
Originality: Highly original
AI Analysis

This paper addresses the problem of learning visual features without labeled data, offering computer vision researchers a novel unsupervised method rather than an incremental refinement of existing ones.

The paper tackles unsupervised visual feature learning by training convolutional neural networks to predict missing image regions from their surrounding context. The learned representations are evaluated quantitatively as CNN pre-training for classification, detection, and segmentation tasks.

We present an unsupervised visual feature learning algorithm driven by context-based pixel prediction. By analogy with auto-encoders, we propose Context Encoders -- a convolutional neural network trained to generate the contents of an arbitrary image region conditioned on its surroundings. In order to succeed at this task, context encoders need to both understand the content of the entire image, as well as produce a plausible hypothesis for the missing part(s). When training context encoders, we have experimented with both a standard pixel-wise reconstruction loss, as well as a reconstruction plus an adversarial loss. The latter produces much sharper results because it can better handle multiple modes in the output. We found that a context encoder learns a representation that captures not just appearance but also the semantics of visual structures. We quantitatively demonstrate the effectiveness of our learned features for CNN pre-training on classification, detection, and segmentation tasks. Furthermore, context encoders can be used for semantic inpainting tasks, either stand-alone or as initialization for non-parametric methods.
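The training objective described in the abstract can be illustrated with a minimal sketch in plain Python. It makes several assumptions that the abstract does not state: the hidden region is a centered square (the abstract allows an arbitrary region), the reconstruction term is a mean squared error over the hidden pixels, the adversarial term takes the common generator-side form -log D(prediction), and the weights `w_rec` and `w_adv` are placeholders. All function names are hypothetical and no network is trained here; the sketch only shows how the two loss terms might be combined.

```python
import math


def mask_center(image, hole):
    """Hide a centered hole x hole square (assumed mask shape).

    Returns (context, mask): the image with the hidden pixels zeroed,
    and a 0/1 mask where 1 marks a hidden pixel.
    """
    h, w = len(image), len(image[0])
    top, left = (h - hole) // 2, (w - hole) // 2
    mask = [[1 if top <= r < top + hole and left <= c < left + hole else 0
             for c in range(w)] for r in range(h)]
    context = [[0.0 if mask[r][c] else image[r][c] for c in range(w)]
               for r in range(h)]
    return context, mask


def reconstruction_loss(image, prediction, mask):
    """Mean squared error over the hidden pixels only (assumed form)."""
    errors = [(image[r][c] - prediction[r][c]) ** 2
              for r in range(len(image))
              for c in range(len(image[0]))
              if mask[r][c]]
    return sum(errors) / len(errors)


def adversarial_loss(d_fake):
    """Generator-side adversarial term, -log D(prediction) (assumed form).

    d_fake is a hypothetical discriminator's probability, in (0, 1],
    that the predicted region is real.
    """
    return -math.log(d_fake)


def joint_loss(rec, adv, w_rec=0.5, w_adv=0.5):
    """Weighted sum of both terms; the weights are placeholders."""
    return w_rec * rec + w_adv * adv


# Usage: a 4x4 image of ones with a 2x2 hidden center.
image = [[1.0] * 4 for _ in range(4)]
context, mask = mask_center(image, hole=2)
prediction = [[0.0] * 4 for _ in range(4)]  # stand-in for a network output
rec = reconstruction_loss(image, prediction, mask)
total = joint_loss(rec, adversarial_loss(0.5))
```

Restricting the reconstruction term to the hidden pixels keeps the visible context from dominating the error. The adversarial term is the part the abstract credits with sharper results, because a pixel-wise loss alone tends to average over multiple plausible completions.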

Code Implementations: 11 repos

Data from Papers with Code (CC-BY-SA-4.0)

