CV · Dec 14, 2015

Learning Deep Features for Discriminative Localization

arXiv:1512.04150v1 · 10600 citations
Originality: Incremental advance
AI Analysis

The paper provides a method for localizing objects without bounding-box annotations, which is useful for computer vision applications, though it builds on an existing technique (global average pooling).

The paper tackles the problem of enabling convolutional neural networks to localize objects using only image-level labels, and does so by revisiting global average pooling. It achieves 37.1% top-5 error for object localization on ILSVRC 2014, close to the 34.2% top-5 error of a fully supervised CNN approach.

In this work, we revisit the global average pooling layer proposed in [13], and shed light on how it explicitly enables the convolutional neural network to have remarkable localization ability despite being trained on image-level labels. While this technique was previously proposed as a means for regularizing training, we find that it actually builds a generic localizable deep representation that can be applied to a variety of tasks. Despite the apparent simplicity of global average pooling, we are able to achieve 37.1% top-5 error for object localization on ILSVRC 2014, which is remarkably close to the 34.2% top-5 error achieved by a fully supervised CNN approach. We demonstrate that our network is able to localize the discriminative image regions on a variety of tasks despite not being trained for them.
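The abstract does not spell out why global average pooling yields localization, so the following is a minimal sketch of the idea rather than the authors' implementation. It assumes a network whose last convolutional feature maps are averaged per channel and fed to a single linear layer. Weighting the un-pooled feature maps by one class's weights then gives a spatial map (a class activation map) whose mean equals that class's score. All names (`features`, `weights`, `class_activation_map`) and the toy values are hypothetical, and plain Python lists are used only to keep the example dependency-free.

```python
def global_average_pool(features):
    """Average each feature map (H x W nested list) down to one number."""
    return [
        sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))
        for fmap in features
    ]


def class_scores(features, weights):
    """Linear layer on top of the pooled features: one score per class."""
    pooled = global_average_pool(features)
    return [sum(w * p for w, p in zip(w_c, pooled)) for w_c in weights]


def class_activation_map(features, w_c):
    """Weighted sum of the un-pooled feature maps for one class (H x W)."""
    height, width = len(features[0]), len(features[0][0])
    return [
        [sum(w * fmap[y][x] for w, fmap in zip(w_c, features)) for x in range(width)]
        for y in range(height)
    ]


# Toy input (assumed values): K = 2 feature maps of size 2 x 2, C = 2 classes.
features = [
    [[0.0, 1.0], [0.0, 3.0]],
    [[2.0, 0.0], [2.0, 0.0]],
]
weights = [
    [1.0, -0.5],  # class 0
    [0.2, 1.0],   # class 1
]

scores = class_scores(features, weights)
top = max(range(len(scores)), key=lambda c: scores[c])
cam = class_activation_map(features, weights[top])
mean_cam = sum(sum(row) for row in cam) / (len(cam) * len(cam[0]))
print("predicted class:", top)
print("class score:", scores[top], "mean of activation map:", mean_cam)
```

Because the class score is just the spatial average of this map, regions with large values are the ones that drive the prediction, which is what makes the map usable for localization even though only image-level labels were seen in training.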

Code Implementations · 35 repos

Data from Papers with Code (CC-BY-SA-4.0)

