CVLGJan 22, 2020

How Much Position Information Do Convolutional Neural Networks Encode?

arXiv:2001.08248v1397 citations
AI Analysis

This addresses a fundamental limitation in understanding CNN behavior for researchers and practitioners, though it is incremental in exploring implicit learning mechanisms.

The paper investigates whether deep convolutional neural networks (CNNs) implicitly encode absolute position information in images, revealing that they surprisingly capture a significant degree of such information, as demonstrated through comprehensive experiments.

In contrast to fully connected networks, Convolutional Neural Networks (CNNs) achieve efficiency by learning weights associated with local filters with a finite spatial extent. An implication of this is that a filter may know what it is looking at, but not where it is positioned in the image. Information concerning absolute position is inherently useful, and it is reasonable to assume that deep CNNs may implicitly learn to encode this information if there is a means to do so. In this paper, we test this hypothesis revealing the surprising degree of absolute position information that is encoded in commonly used neural networks. A comprehensive set of experiments show the validity of this hypothesis and shed light on how and where this information is represented while offering clues to where positional information is derived from in deep CNNs.

Code Implementations2 repos
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes