Discriminative Unsupervised Feature Learning with Exemplar Convolutional Neural Networks
This work aims to reduce the reliance of computer vision tasks on large labeled datasets, offering an incremental improvement in unsupervised feature learning.
The paper learns generic features without labeled data by training a convolutional network to discriminate between surrogate classes, each formed by applying transformations to a single seed image patch. The resulting features achieve state-of-the-art unsupervised learning results on STL-10, CIFAR-10, Caltech-101, and Caltech-256, and outperform the SIFT descriptor in geometric matching.
Deep convolutional networks have proven to be very successful in learning task-specific features that allow for unprecedented performance on various computer vision tasks. Training of such networks mostly follows the supervised learning paradigm, where sufficiently many input-output pairs are required for training. Acquiring large training sets is one of the key challenges when approaching a new task. In this paper, we aim for generic feature learning and present an approach for training a convolutional network using only unlabeled data. To this end, we train the network to discriminate between a set of surrogate classes. Each surrogate class is formed by applying a variety of transformations to a randomly sampled 'seed' image patch. In contrast to supervised network training, the resulting feature representation is not class-specific. Rather, it provides robustness to the transformations that have been applied during training. This generic feature representation allows for classification results that outperform the state of the art for unsupervised learning on several popular datasets (STL-10, CIFAR-10, Caltech-101, Caltech-256). While such generic features cannot compete with class-specific features from supervised training on a classification task, we show that they are advantageous on geometric matching problems, where they also outperform the SIFT descriptor.
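The construction of surrogate classes described above can be illustrated with a minimal sketch. It covers only the data-generation step, not the network or its training. The function names (`sample_seed_patch`, `random_transform`, `build_surrogate_dataset`), the patch size, and the specific transformations (horizontal flip, small wrap-around shift, contrast gain) are assumptions chosen for illustration; the abstract does not specify which transformations are used or their ranges.

```python
import random


def sample_seed_patch(image, size, rng):
    """Crop a size x size patch at a random location of a 2-D grayscale image."""
    height, width = len(image), len(image[0])
    top = rng.randrange(height - size + 1)
    left = rng.randrange(width - size + 1)
    return [row[left:left + size] for row in image[top:top + size]]


def random_transform(patch, rng):
    """Apply a random flip, shift, and contrast change (illustrative choices)."""
    out = [row[:] for row in patch]
    if rng.random() < 0.5:
        # Horizontal flip.
        out = [row[::-1] for row in out]
    # Small horizontal shift; pixels wrap around for simplicity.
    dx = rng.randint(-2, 2)
    if dx:
        out = [row[-dx:] + row[:-dx] for row in out]
    # Contrast-like gain, clipped to the [0, 1] intensity range.
    gain = rng.uniform(0.7, 1.3)
    return [[min(1.0, max(0.0, v * gain)) for v in row] for row in out]


def build_surrogate_dataset(images, num_classes, samples_per_class,
                            patch_size, seed=0):
    """Return (patch, label) pairs; each label identifies one seed patch."""
    rng = random.Random(seed)
    dataset = []
    for label in range(num_classes):
        # One randomly sampled seed patch defines one surrogate class.
        seed_patch = sample_seed_patch(rng.choice(images), patch_size, rng)
        for _ in range(samples_per_class):
            dataset.append((random_transform(seed_patch, rng), label))
    return dataset


if __name__ == "__main__":
    rng = random.Random(42)
    # Stand-in for unlabeled images: three 32 x 32 arrays of random intensities.
    images = [[[rng.random() for _ in range(32)] for _ in range(32)]
              for _ in range(3)]
    data = build_surrogate_dataset(images, num_classes=4,
                                   samples_per_class=5, patch_size=8)
    print(len(data), "training pairs")
```

A classifier trained on such pairs has to assign all transformed versions of one seed patch to the same label, which is what would encourage features that are robust to the applied transformations.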