Cross-dimensional Weighting for Aggregated Deep Convolutional Features
This work targets image search performance in computer vision. Its contribution is incremental, since it builds on existing approaches that use pre-trained networks.
The authors tackle the problem of creating powerful image representations by proposing a cross-dimensional weighting and aggregation method for deep convolutional features, which outperforms the state of the art among approaches based on pre-trained networks on public image search datasets.
We propose a simple and straightforward way of creating powerful image representations via cross-dimensional weighting and aggregation of deep convolutional neural network layer outputs. We first present a generalized framework that encompasses a broad family of approaches and includes cross-dimensional pooling and weighting steps. We then propose specific non-parametric schemes for both spatial- and channel-wise weighting that boost the effect of highly active spatial responses and at the same time regulate burstiness effects. We experiment on different public datasets for image search and show that our approach outperforms the current state-of-the-art for approaches based on pre-trained networks. We also provide an easy-to-use, open source implementation that reproduces our results.
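To make the pipeline above concrete, here is a minimal NumPy sketch of the general pattern: weight each spatial location, sum-pool over space, weight each channel, then normalize. The abstract does not state the weighting formulas, so the specific choices below are assumptions made for illustration (spatial weights from the channel-summed activation map, channel weights from a log-inverse measure of how often a channel is non-zero), and the function names `spatial_weights`, `channel_weights`, and `aggregate` are hypothetical. It should not be read as the authors' released implementation.

```python
import numpy as np

def spatial_weights(feats):
    """Assumed spatial weighting: locations with strong total activation count more.

    feats: array of shape (K, H, W) holding non-negative (post-ReLU) activations.
    Returns an (H, W) array of weights.
    """
    s = feats.sum(axis=0)                      # sum over channels -> (H, W)
    norm = np.sqrt((s ** 2).sum())             # L2 norm of the summed map
    if norm == 0:
        return np.ones_like(s)                 # no activation: fall back to uniform
    return np.sqrt(s / norm)                   # square root damps very large peaks

def channel_weights(feats, eps=1e-6):
    """Assumed channel weighting: channels that fire often are down-weighted.

    This is an IDF-like heuristic intended to limit burstiness.
    Returns a (K,) array of non-negative weights.
    """
    k = feats.shape[0]
    q = (feats > 0).mean(axis=(1, 2))          # fraction of non-zero cells per channel
    return np.log((k * eps + q.sum()) / (eps + q))

def aggregate(feats):
    """Weight spatially, sum-pool over space, weight per channel, L2-normalize."""
    a = spatial_weights(feats)                 # (H, W)
    b = channel_weights(feats)                 # (K,)
    pooled = (feats * a[None, :, :]).sum(axis=(1, 2))   # (K,)
    vec = pooled * b
    n = np.linalg.norm(vec)
    return vec / n if n > 0 else vec

# Usage on synthetic activations standing in for one convolutional layer output.
rng = np.random.default_rng(0)
feats = np.maximum(rng.standard_normal((8, 5, 5)), 0.0)  # K=8 channels, 5x5 map
vec = aggregate(feats)
print(vec.shape)
```

Both weightings here are computed from the feature map itself, with no learned parameters, which matches the non-parametric character described in the abstract. A real pipeline would take `feats` from a pre-trained network rather than from random numbers.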