CV, Feb 28, 2012

Fast approximations to structured sparse coding and applications to object classification

arXiv:1202.6384v1, 39 citations
Originality: Incremental advance
AI Analysis

This work addresses the computational bottleneck of sparse coding in real-time object classification for computer vision applications. It is an incremental advance, since it builds on existing sparse coding frameworks.

The paper tackles the problem of slow sparse coding for object recognition by introducing a fast approximation method that uses a binary decision tree and learned dictionaries, achieving 20 frames per second on a quad-core laptop with minimal accuracy loss on the Caltech 101 and 15 scenes benchmarks.

We describe a method for fast approximation of sparse coding. The input space is subdivided by a binary decision tree, and we simultaneously learn a dictionary and an assignment of allowed dictionary elements for each leaf of the tree. We store a lookup table with the assignments and the pseudoinverses for each node, allowing for very fast inference. We give an algorithm for learning the tree, the dictionary, and the dictionary element assignment. In the process of describing this algorithm, we discuss the more general problem of learning the groups in group structured sparse modelling. We show that our method creates good sparse representations by using it in the object recognition framework of \cite{lazebnik06,yang-cvpr-09}. With our own fast implementation of the SIFT descriptor, the whole system runs at 20 frames per second on $321 \times 481$ images on a laptop with a quad-core CPU, while sacrificing very little accuracy on the Caltech 101 and 15 scenes benchmarks.
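The inference step described in the abstract (descend the tree, look up the allowed dictionary elements and their pseudoinverse, multiply) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the names `Node`, `make_leaf`, and `encode` are hypothetical, the hyperplane split rule is assumed (the abstract only says "binary decision tree"), and the toy dictionary and tree are hand-built rather than learned.

```python
import numpy as np

class Node:
    """Internal node: routes x by a hyperplane test (an assumed split rule).
    Leaf: stores the allowed atom indices and a precomputed pseudoinverse."""
    def __init__(self, w=None, b=0.0, left=None, right=None,
                 atoms=None, pinv=None):
        self.w, self.b = w, b
        self.left, self.right = left, right
        self.atoms, self.pinv = atoms, pinv

def make_leaf(D, atoms):
    # Precompute the pseudoinverse of the sub-dictionary once, at build
    # time, so that inference needs only a single matrix-vector product.
    atoms = np.asarray(atoms)
    return Node(atoms=atoms, pinv=np.linalg.pinv(D[:, atoms]))

def encode(root, x, n_atoms):
    # Descend the tree until a leaf is reached.
    node = root
    while node.atoms is None:
        node = node.left if node.w @ x <= node.b else node.right
    # Least-squares coefficients restricted to the leaf's allowed atoms.
    z = np.zeros(n_atoms)
    z[node.atoms] = node.pinv @ x
    return z

# Toy example (hypothetical data): 4-dimensional input, 4 atoms, one split.
D = np.eye(4)
root = Node(
    w=np.array([1.0, 1.0, -1.0, -1.0]), b=0.0,
    left=make_leaf(D, [2, 3]),    # inputs dominated by coordinates 2 and 3
    right=make_leaf(D, [0, 1]),   # inputs dominated by coordinates 0 and 1
)
x = np.array([3.0, 4.0, 0.1, 0.0])
z = encode(root, x, D.shape[1])
```

Here `x` is routed to the right leaf, so only atoms 0 and 1 receive nonzero coefficients. The cost per input is one tree descent plus one small matrix-vector product, which is the source of the speed the abstract describes.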

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
