Efficient Image Categorization with Sparse Fisher Vector
This work addresses the high computational cost of image categorization, a concern for both researchers and practitioners. The contribution is incremental in that it builds on existing Fisher vector methods.
The paper tackles the computational inefficiency of Fisher vector (FV) representation in object recognition by proposing Sparse Fisher vector (SFV), which accelerates the Fisher coding step through a locality strategy, resulting in a several-fold speedup while maintaining categorization performance on benchmark datasets.
In object recognition, the Fisher vector (FV) is one of the state-of-the-art image representations, but it comes at the expense of dense, high-dimensional features and increased computation time. A simplification of FV is therefore attractive, and we propose the Sparse Fisher vector (SFV). By incorporating a locality strategy, we accelerate the Fisher coding step in image categorization, which operates on a collection of local descriptors. We also explore the relationship between the coding step and the pooling step to give a theoretical explanation of SFV. Experiments on benchmark datasets show that SFV yields a several-fold speedup compared with FV while maintaining categorization performance. In addition, we demonstrate how SFV preserves consistency in the representation of similar local features.
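To make the coding step concrete, the following is a minimal sketch of Fisher coding under a diagonal-covariance Gaussian mixture model, using the standard gradients with respect to the means and standard deviations. The abstract does not spell out the locality rule, so the sketch assumes one plausible reading: for each local descriptor, only the k components with the largest posterior are kept and the rest are set to zero. The function name `fisher_vector` and the parameter `k` are hypothetical and are not taken from the paper.

```python
import numpy as np


def fisher_vector(X, weights, means, sigmas, k=None):
    """Encode local descriptors X as a Fisher vector under a diagonal GMM.

    X: (N, D) descriptors; weights: (K,) mixture weights;
    means, sigmas: (K, D) component means and standard deviations.
    k: if given, keep only the k largest posteriors per descriptor
       (an ASSUMED locality rule, not confirmed by the source).
    Returns a vector of length 2 * K * D.
    """
    N, D = X.shape
    K = weights.shape[0]

    # Whitened differences (x_n - mu_k) / sigma_k, shape (N, K, D).
    diff = (X[:, None, :] - means[None, :, :]) / sigmas[None, :, :]

    # log(w_k * N(x_n | mu_k, sigma_k^2)) for every descriptor/component pair.
    log_p = (np.log(weights)[None, :]
             - 0.5 * np.sum(diff ** 2, axis=2)
             - np.sum(np.log(sigmas), axis=1)[None, :]
             - 0.5 * D * np.log(2.0 * np.pi))

    # Posteriors gamma[n, k], computed stably by subtracting the row maximum.
    log_p -= log_p.max(axis=1, keepdims=True)
    gamma = np.exp(log_p)
    gamma /= gamma.sum(axis=1, keepdims=True)

    if k is not None and k < K:
        # Assumed locality step: zero the K - k smallest posteriors per row,
        # then renormalize the survivors so each row sums to one again.
        drop = np.argpartition(gamma, K - k, axis=1)[:, :K - k]
        np.put_along_axis(gamma, drop, 0.0, axis=1)
        gamma /= gamma.sum(axis=1, keepdims=True)

    # Average pooling of the per-descriptor gradients over all N descriptors.
    g_mu = np.einsum('nk,nkd->kd', gamma, diff)
    g_mu /= N * np.sqrt(weights)[:, None]
    g_sigma = np.einsum('nk,nkd->kd', gamma, diff ** 2 - 1.0)
    g_sigma /= N * np.sqrt(2.0 * weights)[:, None]

    return np.concatenate([g_mu.ravel(), g_sigma.ravel()])
```

Calling `fisher_vector(X, weights, means, sigmas)` gives the dense encoding, while passing for example `k=1` gives the sparsified variant. For clarity this sketch still evaluates every descriptor/component pair, so it illustrates the effect of sparsification on the encoding rather than the speedup itself; an efficient implementation would accumulate gradients only for the retained components.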