CVJul 23, 2015

Deep Fishing: Gradient Features from Deep Nets

arXiv:1507.06429v11 citations
Originality Incremental advance
AI Analysis

This work addresses image recognition by bridging deep learning and traditional gradient-based methods, offering incremental improvements in performance.

The paper connects ConvNets and Fisher Vectors by deriving gradient representations from ConvNets, showing they form a structured matrix for efficient similarity computation, and demonstrates consistent improvements on Pascal VOC 2007 and 2012 datasets.

Convolutional Networks (ConvNets) have recently improved image recognition performance thanks to end-to-end learning of deep feed-forward models from raw pixels. Deep learning is a marked departure from the previous state of the art, the Fisher Vector (FV), which relied on gradient-based encoding of local hand-crafted features. In this paper, we discuss a novel connection between these two approaches. First, we show that one can derive gradient representations from ConvNets in a similar fashion to the FV. Second, we show that this gradient representation actually corresponds to a structured matrix that allows for efficient similarity computation. We experimentally study the benefits of transferring this representation over the outputs of ConvNet layers, and find consistent improvements on the Pascal VOC 2007 and 2012 datasets.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes