CV · AI · Mar 2, 2021

Self-supervised Pretraining of Visual Features in the Wild

arXiv:2103.01988v2 · 302 citations · Has Code
Originality: Incremental advance
AI Analysis

This work addresses the challenge of scaling self-supervised learning for computer vision to large, uncurated datasets, and reports evidence that the approach remains viable outside curated benchmarks.

The paper asks whether self-supervised learning works on random, uncurated images in a real-world setting. The resulting SEER model reaches 84.2% top-1 accuracy, surpassing the best prior self-supervised pretrained model by 1%. It also shows strong few-shot performance, reaching 77.9% top-1 accuracy with access to only 10% of ImageNet.
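Both headline numbers are top-1 accuracies: the fraction of evaluation images whose highest-scoring predicted class matches the true label. As a minimal sketch of that metric only (not the paper's evaluation code), the snippet below computes it on toy data. The function name `top1_accuracy` and the example scores and labels are illustrative assumptions, not values from the paper or from VISSL.

```python
def top1_accuracy(scores, labels):
    """Return the fraction of examples whose top-scoring class equals the label.

    scores: list of per-class score lists, one list per example.
    labels: list of integer class indices, one per example.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one example is required")

    correct = 0
    for row, label in zip(scores, labels):
        # Index of the highest score is the model's top-1 prediction.
        predicted = max(range(len(row)), key=row.__getitem__)
        correct += int(predicted == label)
    return correct / len(labels)


# Toy data (hypothetical): 4 examples, 3 classes.
scores = [
    [0.1, 0.7, 0.2],  # predicts class 1
    [0.8, 0.1, 0.1],  # predicts class 0
    [0.3, 0.3, 0.4],  # predicts class 2
    [0.2, 0.5, 0.3],  # predicts class 1
]
labels = [1, 0, 2, 2]

accuracy = top1_accuracy(scores, labels)
print(f"top-1 accuracy: {accuracy:.1%}")
```

On this toy input three of four predictions match, so the metric is 75%. The reported 84.2% and 77.9% figures are the same ratio computed over an evaluation set.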

Recently, self-supervised learning methods like MoCo, SimCLR, BYOL and SwAV have reduced the gap with supervised methods. These results have been achieved in a controlled environment, namely the highly curated ImageNet dataset. However, the premise of self-supervised learning is that it can learn from any random image and from any unbounded dataset. In this work, we explore whether self-supervision lives up to its expectation by training large models on random, uncurated images with no supervision. Our final SElf-supERvised (SEER) model, a RegNetY with 1.3B parameters trained on 1B random images with 512 GPUs, achieves 84.2% top-1 accuracy, surpassing the best self-supervised pretrained model by 1% and confirming that self-supervised learning works in a real-world setting. Interestingly, we also observe that self-supervised models are good few-shot learners, achieving 77.9% top-1 with access to only 10% of ImageNet. Code: https://github.com/facebookresearch/vissl

Code Implementations: 1 repo
