CV | May 29, 2015

Learning to count with deep object features

arXiv:1505.08082v1 | 84 citations
AI Analysis

This work addresses object counting in computer vision. Its contribution is incremental: it builds on the existing learning-to-count framework and analyzes the features that a counting network learns.

The paper tackles counting object instances in scenes by training a convolutional neural network as a regressor. It finds that the network's internal features can classify MNIST digits even though no digit labels were provided during training, and it reports preliminary results on pedestrian counting.

Learning to count is a learning strategy that has been recently proposed in the literature for dealing with problems where estimating the number of object instances in a scene is the final objective. In this framework, the task of learning to detect and localize individual object instances is seen as a harder task that can be evaded by casting the problem as that of computing a regression value from hand-crafted image features. In this paper we explore the features that are learned when training a counting convolutional neural network in order to understand their underlying representation. To this end we define a counting problem for MNIST data and show that the internal representation of the network is able to classify digits in spite of the fact that no direct supervision was provided for them during training. We also present preliminary results about a deep network that is able to count the number of pedestrians in a scene.
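To make the idea of "a counting problem for MNIST data" concrete, the following is a minimal sketch of how such a training example could be assembled: digit patches are pasted onto a blank canvas and the only supervision is a single number. Everything specific here is an assumption for illustration, not the paper's stated setup: the function name `make_counting_example`, the 100x100 canvas, the limit of five digits, the choice of "even digits" as the quantity to count, and the random arrays standing in for real MNIST images.

```python
import numpy as np


def make_counting_example(patches, labels, rng, canvas_size=100, max_digits=5):
    """Paste randomly chosen digit patches onto a blank canvas.

    Returns (image, count). `count` is the number of pasted digits whose
    label is even (an illustrative choice of target, assumed here).
    """
    patch_h, patch_w = patches.shape[1:]
    image = np.zeros((canvas_size, canvas_size), dtype=np.float32)

    # Choose how many digits to place, then which ones.
    n = int(rng.integers(1, max_digits + 1))
    chosen = rng.integers(0, len(patches), size=n)

    count = 0
    for idx in chosen:
        # Random top-left corner that keeps the patch inside the canvas.
        top = int(rng.integers(0, canvas_size - patch_h + 1))
        left = int(rng.integers(0, canvas_size - patch_w + 1))
        region = image[top:top + patch_h, left:left + patch_w]
        # Pixel-wise maximum, so overlapping digits do not erase each other.
        np.maximum(region, patches[idx], out=region)
        count += int(labels[idx] % 2 == 0)

    # The regression target is the count alone: no positions, no digit labels.
    return image, float(count)


# Stand-in data with MNIST-like shapes (random noise, not real digits).
rng = np.random.default_rng(0)
patches = rng.random((50, 28, 28)).astype(np.float32)
labels = rng.integers(0, 10, size=50)

image, count = make_counting_example(patches, labels, rng)
```

A regression network trained on pairs like `(image, count)` never sees per-digit labels or locations, which is what makes it notable that its internal representation can still separate digit classes.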

Code Implementations: 1 repo