Pooling-Invariant Image Feature Learning
This paper addresses a specific bottleneck in computer vision recognition pipelines and offers researchers an incremental improvement in the efficiency of feature learning.
The paper tackles the problem of features that become redundant after spatial pooling when dictionaries are learned, unsupervised, on patches in vision architectures. It proposes a novel clustering-based scheme that yields better dictionaries than patch-based methods of the same dictionary size.
Unsupervised dictionary learning has been a key component in state-of-the-art computer vision recognition architectures. While highly effective methods exist for patch-based dictionary learning, these methods may learn redundant features after the pooling stage in a given early vision architecture. In this paper, we offer a novel dictionary learning scheme to efficiently take into account the invariance of learned features after the spatial pooling stage. The algorithm is built on simple clustering, and thus enjoys efficiency and scalability. We discuss the underlying mechanism that justifies the use of clustering algorithms, and empirically show that the algorithm finds better dictionaries than patch-based methods with the same dictionary size.
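To make the idea of accounting for invariance after pooling more concrete, here is a minimal sketch of one way a clustering-based, pooling-aware dictionary selection could look. It is not the paper's implementation. The abstract does not specify the encoder, the pooling operator, or the clustering algorithm, so the rectified dot-product encoding, whole-image max pooling, plain k-means, and all function names below are assumptions chosen for illustration. The sketch first learns an oversized patch dictionary, then groups atoms whose pooled responses behave alike across images and keeps one representative per group.

```python
import numpy as np


def extract_patches(images, patch=4, stride=2):
    """Return mean-centred patches with shape (n_images, n_positions, patch*patch)."""
    n, h, w = images.shape
    rows = range(0, h - patch + 1, stride)
    cols = range(0, w - patch + 1, stride)
    out = [images[:, r:r + patch, c:c + patch].reshape(n, -1)
           for r in rows for c in cols]
    patches = np.stack(out, axis=1)
    return patches - patches.mean(axis=2, keepdims=True)


def nearest(X, centroids):
    """Index of the closest centroid (squared Euclidean distance) for each row of X."""
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)


def kmeans(X, k, n_iter=15, seed=0):
    """Plain Lloyd's k-means; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = nearest(X, centroids)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # leave an empty cluster's centroid where it is
                centroids[j] = members.mean(axis=0)
    return centroids, nearest(X, centroids)


def pooled_responses(patches, dictionary):
    """Encode each patch (assumed: rectified dot product), then max-pool per image."""
    codes = np.maximum(patches @ dictionary.T, 0.0)  # (n_images, n_positions, K)
    return codes.max(axis=1)                         # (n_images, K)


def select_pooling_aware(dictionary, pooled, m, seed=0):
    """Cluster atoms by their pooled response profiles; keep one atom per cluster."""
    profiles = pooled.T                              # one row per atom
    profiles = profiles / (np.linalg.norm(profiles, axis=1, keepdims=True) + 1e-12)
    centroids, labels = kmeans(profiles, m, seed=seed)
    selected = []
    for j in range(m):
        members = np.flatnonzero(labels == j)
        if members.size == 0:                        # skip empty clusters
            continue
        dist = ((profiles[members] - centroids[j]) ** 2).sum(axis=1)
        selected.append(int(members[dist.argmin()]))
    return dictionary[selected], selected


# Toy data standing in for real images (random values, illustration only).
rng = np.random.default_rng(0)
images = rng.standard_normal((40, 12, 12))
patches = extract_patches(images)

# Step 1: oversized patch-level dictionary via k-means on all patches.
start_dict, _ = kmeans(patches.reshape(-1, patches.shape[2]), k=32)

# Step 2: pooled outputs, then pooling-aware reduction to at most 8 atoms.
pooled = pooled_responses(patches, start_dict)
final_dict, selected = select_pooling_aware(start_dict, pooled, m=8)
print(final_dict.shape, sorted(selected))
```

The design choice illustrated here is that redundancy is measured on pooled outputs rather than on raw patches: two atoms that look different at the patch level may still produce nearly identical pooled responses, and only one of them needs to be kept. A patch-based baseline of the same size would instead run k-means directly with `k=8`.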