ML CV LG May 22, 2018

Deformable Part Networks

arXiv:1805.08808v1
Originality: Highly original
AI Analysis

This paper addresses pose-invariant object recognition in computer vision and proposes a method that the authors report improves over existing pose-aware approaches.

The paper tackles learning pose-invariant representations for 2D object recognition. It proposes Deformable Part Networks (DPNs), which the authors report outperform CapsNets and STNs on affNIST by 19.19% and 12.75%, respectively, with better generalization and tolerance to affine transformations.

In this paper we propose novel Deformable Part Networks (DPNs) to learn pose-invariant representations for 2D object recognition. In contrast to the state-of-the-art pose-aware networks such as CapsNet (Sabour et al., 2017) and STN (Jaderberg et al., 2015), DPNs can be naturally interpreted as an efficient solver for a challenging detection problem, namely Localized Deformable Part Models (LDPMs), where localization is introduced to DPMs as another latent variable for searching for the best poses of objects over all pixels and (predefined) scales. In particular, we construct DPNs as sequences of such LDPM units to model the semantic and spatial relations among the deformable parts as hierarchical composition and spatial parsing trees. Empirically, our 17-layer DPN can outperform both CapsNets and STNs significantly on affNIST (Sabour et al., 2017), for instance, by 19.19% and 12.75%, respectively, with better generalization and better tolerance to affine transformations.
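
The abstract describes localization as a latent variable searched over all pixels. As a rough illustration of that idea only, the sketch below scores a generic deformable part model at every root location on a single scale and keeps the best one. It is not the paper's LDPM unit: the function names, the quadratic deformation cost, the `deform_weight` and `radius` parameters, and the omission of the scale search are all assumptions made for this example.

```python
import numpy as np

def part_score(response, anchor, deform_weight, radius):
    """Best placement of one part near its anchor.

    Score = filter response minus an assumed quadratic deformation cost.
    """
    h, w = response.shape
    ay, ax = anchor
    best, best_pos = -np.inf, None
    for y in range(max(0, ay - radius), min(h, ay + radius + 1)):
        for x in range(max(0, ax - radius), min(w, ax + radius + 1)):
            cost = deform_weight * ((y - ay) ** 2 + (x - ax) ** 2)
            score = response[y, x] - cost
            if score > best:
                best, best_pos = score, (y, x)
    return best, best_pos

def localized_score(root_response, part_responses, offsets,
                    deform_weight=0.1, radius=2):
    """Treat the root location as a latent variable and maximize over all pixels.

    For each candidate root pixel, add the root response to the best
    deformation-penalized score of every part, then keep the best total.
    Only a single scale is searched here (an assumption for brevity).
    """
    h, w = root_response.shape
    best, best_loc = -np.inf, None
    for y in range(h):
        for x in range(w):
            total = root_response[y, x]
            for resp, (dy, dx) in zip(part_responses, offsets):
                # Clamp the part's ideal position to the image bounds.
                ay = min(max(y + dy, 0), h - 1)
                ax = min(max(x + dx, 0), w - 1)
                score, _ = part_score(resp, (ay, ax), deform_weight, radius)
                total += score
            if total > best:
                best, best_loc = total, (y, x)
    return best, best_loc
```

Calling `localized_score(root, [part], [(0, 1)])` with two response maps of the same shape returns the best total score and the root pixel that achieves it. The exhaustive loops are kept for readability and are not meant to reflect how such a search would be made efficient.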

