CVSep 18, 2014

Deformable Part Models are Convolutional Neural Networks

arXiv:1409.5403v2468 citations
Originality Highly original
AI Analysis

This provides a synthesis of two major visual recognition tools, improving object detection performance and speed for computer vision applications.

The paper shows that deformable part models (DPMs) can be formulated as convolutional neural networks (CNNs), enabling the replacement of standard features with learned ones, resulting in DeepPyramid DPM which outperforms HOG-based DPMs and slightly beats R-CNN on PASCAL VOC while running much faster.

Deformable part models (DPMs) and convolutional neural networks (CNNs) are two widely used tools for visual recognition. They are typically viewed as distinct approaches: DPMs are graphical models (Markov random fields), while CNNs are "black-box" non-linear classifiers. In this paper, we show that a DPM can be formulated as a CNN, thus providing a novel synthesis of the two ideas. Our construction involves unrolling the DPM inference algorithm and mapping each step to an equivalent (and at times novel) CNN layer. From this perspective, it becomes natural to replace the standard image features used in DPM with a learned feature extractor. We call the resulting model DeepPyramid DPM and experimentally validate it on PASCAL VOC. DeepPyramid DPM significantly outperforms DPMs based on histograms of oriented gradients features (HOG) and slightly outperforms a comparable version of the recently introduced R-CNN detection system, while running an order of magnitude faster.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes