Second-order Democratic Aggregation
This work addresses feature aggregation in computer vision, offering incremental improvements over existing methods for aggregating second-order features, which are used in tasks such as texture generation and fine-grained recognition.
The paper tackles the problem of aggregating second-order features in deep networks by introducing γ-democratic aggregators that interpolate between sum and democratic pooling, achieving state-of-the-art performance on several classification tasks with improved efficiency and low-dimensional computation.
Aggregated second-order features extracted from deep convolutional networks have been shown to be effective for texture generation, fine-grained recognition, material classification, and scene understanding. In this paper, we study a class of orderless aggregation functions designed to minimize interference or equalize contributions in the context of second-order features. We show that they can be computed just as efficiently as their first-order counterparts and that they have favorable properties compared with aggregation by summation. Another line of work has shown that matrix power normalization after aggregation can significantly improve the generalization of second-order representations. We show that matrix power normalization implicitly equalizes contributions during aggregation, thus establishing a connection between matrix normalization techniques and prior work on minimizing interference. Based on this analysis, we present γ-democratic aggregators that interpolate between sum pooling (γ=1) and democratic pooling (γ=0), outperforming both on several classification tasks. Moreover, unlike power normalization, the γ-democratic aggregations can be computed in a low-dimensional space by sketching, which allows the use of very high-dimensional second-order features. This results in state-of-the-art performance on several datasets.
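To make the interpolation between sum and democratic pooling concrete, the following is a minimal sketch in Python with NumPy. It rests on assumptions that the abstract does not spell out: that the similarity between two second-order features x_i x_i^T and x_j x_j^T is K_ij = (x_i^T x_j)^2, that the weights α are chosen so that each feature's contribution α_i Σ_j α_j K_ij matches (Σ_j K_ij)^γ, and that a dampened Sinkhorn-style iteration is an acceptable way to solve for α. The function names `gamma_democratic_weights` and `aggregate` are hypothetical, and the sketch should not be read as the authors' implementation.

```python
import numpy as np

def gamma_democratic_weights(X, gamma=0.5, n_iter=50):
    """Return per-feature weights alpha for local features X of shape (n, d).

    Assumed formulation: alpha_i * sum_j alpha_j K_ij ~= (sum_j K_ij) ** gamma,
    with K_ij = (x_i . x_j) ** 2 for second-order features.
    """
    K = (X @ X.T) ** 2                  # assumed second-order kernel
    target = K.sum(axis=1) ** gamma     # gamma=1: sum pooling, gamma=0: equal contributions
    alpha = np.ones(X.shape[0])
    for _ in range(n_iter):
        contrib = alpha * (K @ alpha)   # current contribution of each feature
        # Dampened multiplicative update (square root) toward the target.
        alpha = alpha * np.sqrt(target / contrib)
    return alpha

def aggregate(X, alpha):
    """Weighted second-order aggregate: sum_i alpha_i * x_i x_i^T, shape (d, d)."""
    return (X * alpha[:, None]).T @ X

# Usage on random data (illustrative only, not a result from the paper).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 8))
A_sum = aggregate(X, gamma_democratic_weights(X, gamma=1.0))   # same as X.T @ X
A_dem = aggregate(X, gamma_democratic_weights(X, gamma=0.0))   # contributions equalized
A_mid = aggregate(X, gamma_democratic_weights(X, gamma=0.5))   # in between
```

Under these assumptions, γ=1 leaves all weights at 1 and recovers plain summation, while γ=0 down-weights features that are similar to many others so that every feature contributes roughly equally. The sketching step mentioned in the abstract is not shown here.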