CV, LG | Jun 10, 2014

Why do linear SVMs trained on HOG features perform so well?

arXiv:1406.2419v1 | 45 citations
Originality: Synthesis-oriented
AI Analysis

The paper addresses the performance gap in visual perception tasks such as pedestrian detection, though its contribution is incremental: it explains existing methods rather than introducing a new one.

The paper investigates why linear SVMs trained on HOG features excel in visual tasks. It shows that HOG both induces capacity in the SVM and adds a prior to it by preserving local second-order statistics, and it demonstrates surprising accuracy on expression recognition and pedestrian detection.

Linear Support Vector Machines trained on HOG features are now a de facto standard across many visual perception tasks. Their popularisation can largely be attributed to the step-change in performance they brought to pedestrian detection, and their subsequent successes in deformable parts models. This paper explores the interactions that make the HOG-SVM symbiosis perform so well. By connecting the feature extraction and learning processes rather than treating them as disparate plugins, we show that HOG features can be viewed as doing two things: (i) inducing capacity in, and (ii) adding prior to a linear SVM trained on pixels. From this perspective, preserving second-order statistics and locality of interactions are key to good performance. We demonstrate surprising accuracy on expression recognition and pedestrian detection tasks, by assuming only the importance of preserving such local second-order interactions.
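
To make "local second-order interactions" concrete, the sketch below builds one possible feature vector of that kind: the products of every pixel with itself and with its nearby neighbours. This is an illustrative assumption, not the authors' implementation; the function name `local_second_order_features` and the `radius` parameter are hypothetical. A linear classifier trained on such features is quadratic in the original pixels, while the neighbourhood limit restricts which pixel pairs may interact.

```python
def local_second_order_features(image, radius=1):
    """Return products image[p] * image[q] for every unordered pixel pair
    (p, q) whose row and column offsets are both at most `radius`.

    `image` is a list of equal-length rows of numbers. Hypothetical helper
    for illustration only; it is not taken from the paper.
    """
    height = len(image)
    width = len(image[0])
    features = []
    for r in range(height):
        for c in range(width):
            # Visit only offsets in the "forward" half-plane so that each
            # unordered pair is counted exactly once (self-pairs included).
            for dr in range(0, radius + 1):
                for dc in range(-radius, radius + 1):
                    if dr == 0 and dc < 0:
                        continue
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < height and 0 <= cc < width:
                        features.append(image[r][c] * image[rr][cc])
    return features


patch = [[1, 2],
         [3, 4]]
feats = local_second_order_features(patch, radius=1)
print(len(feats))  # 10: every unordered pair among 4 pixels is local here
```

With `radius=0` only the squared pixel values remain; a larger `radius` admits longer-range pairs, so the parameter controls how strictly locality is enforced in this sketch.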

Code Implementations: 2 repos