CV · Nov 29, 2017

Patch Correspondences for Interpreting Pixel-level CNNs

arXiv:1711.10683v4 · 2 citations
Originality: Incremental advance
AI Analysis

CompNN gives researchers and practitioners a tool to better understand and control CNN embeddings in pixel-level tasks. The advance is incremental, since it builds on existing patch-match methods.

The paper tackles the problem of interpreting distributed representations in CNNs for pixel-level tasks. It introduces compositional nearest neighbors (CompNN), which reconstructs input and output images by copy-pasting patches from the training set that have similar feature embeddings, and it demonstrates the method's utility in semantic segmentation and image-to-image translation.

We present compositional nearest neighbors (CompNN), a simple approach to visually interpreting distributed representations learned by a convolutional neural network (CNN) for pixel-level tasks (e.g., image synthesis and segmentation). It does so by reconstructing both a CNN's input and output image by copy-pasting corresponding patches from the training set with similar feature embeddings. To do so efficiently, it makes use of a patch-match-based algorithm that exploits the fact that the patch representations learned by a CNN for pixel-level tasks vary smoothly. Finally, we show that CompNN can be used to establish semantic correspondences between two images and control properties of the output image by modifying the images contained in the training set. We present qualitative and quantitative experiments for semantic segmentation and image-to-image translation that demonstrate that CompNN is a good tool for interpreting the embeddings learned by pixel-level CNNs.
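The copy-paste reconstruction described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it swaps the patch-match-based search for a brute-force cosine-similarity lookup, and the function name `comp_nn`, its arguments, the odd patch size, and the averaging of overlapping patches are all assumptions made for illustration.

```python
import numpy as np


def comp_nn(query_feats, train_feats, train_patches):
    """Brute-force compositional nearest neighbours (illustrative sketch).

    query_feats:   (H, W, D) per-pixel embeddings of the query image
    train_feats:   (N, D) per-pixel embeddings pooled from training images
    train_patches: (N, P, P, C) patch centred on each training pixel, P odd
    Returns the reconstruction (H, W, C) and the matched indices (H, W).
    """
    H, W, D = query_feats.shape
    P, C = train_patches.shape[1], train_patches.shape[3]
    r = P // 2  # patch radius (assumes an odd patch size)

    # Cosine similarity between every query pixel and every training pixel.
    # NOTE: exhaustive search stands in for the patch-match-based search.
    q = query_feats.reshape(-1, D)
    q = q / (np.linalg.norm(q, axis=1, keepdims=True) + 1e-8)
    t = train_feats / (np.linalg.norm(train_feats, axis=1, keepdims=True) + 1e-8)
    nn_idx = np.argmax(q @ t.T, axis=1).reshape(H, W)

    # Paste the matched patch around each pixel and average the overlaps
    # (averaging is an assumption of this sketch).
    acc = np.zeros((H + 2 * r, W + 2 * r, C))
    cnt = np.zeros((H + 2 * r, W + 2 * r, 1))
    for y in range(H):
        for x in range(W):
            acc[y:y + P, x:x + P] += train_patches[nn_idx[y, x]]
            cnt[y:y + P, x:x + P] += 1

    # Crop the padding so the result has the query image's size.
    recon = (acc / cnt)[r:r + H, r:r + W]
    return recon, nn_idx
```

Under these assumptions, passing patches cut from the training inputs as `train_patches` reconstructs the input image, while passing patches cut from the corresponding training outputs (for example, label maps) reconstructs the output. Editing the training patches changes the reconstruction accordingly, and `nn_idx` records which training pixel each query pixel was matched to.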

