CVJun 11, 2021

AugNet: End-to-End Unsupervised Visual Representation Learning with Image Augmentation

arXiv:2106.06250v110 citationsHas Code
Originality Incremental advance
AI Analysis

This addresses the problem of reducing reliance on labeled data for computer vision tasks, offering a competitive unsupervised approach for researchers and practitioners, though it appears incremental as it builds on existing augmentation-based methods.

The authors tackled unsupervised visual representation learning by proposing AugNet, a method that learns image features from unlabeled pictures using similarities between augmented versions, achieving over 60% accuracy on STL10 and 27% on CIFAR100 in unsupervised clustering.

Most of the achievements in artificial intelligence so far were accomplished by supervised learning which requires numerous annotated training data and thus costs innumerable manpower for labeling. Unsupervised learning is one of the effective solutions to overcome such difficulties. In our work, we propose AugNet, a new deep learning training paradigm to learn image features from a collection of unlabeled pictures. We develop a method to construct the similarities between pictures as distance metrics in the embedding space by leveraging the inter-correlation between augmented versions of samples. Our experiments demonstrate that the method is able to represent the image in low dimensional space and performs competitively in downstream tasks such as image classification and image similarity comparison. Specifically, we achieved over 60% and 27% accuracy on the STL10 and CIFAR100 datasets with unsupervised clustering, respectively. Moreover, unlike many deep-learning-based image retrieval algorithms, our approach does not require access to external annotated datasets to train the feature extractor, but still shows comparable or even better feature representation ability and easy-to-use characteristics. In our evaluations, the method outperforms all the state-of-the-art image retrieval algorithms on some out-of-domain image datasets. The code for the model implementation is available at https://github.com/chenmingxiang110/AugNet.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes