CV, LG · Apr 7, 2022

Solving ImageNet: a Unified Scheme for Training any Backbone to Top Results

arXiv:2204.03475v2 · 13 citations · h-index: 12 · Has Code
AI Analysis

This work replaces expert-tuned, per-architecture ImageNet training with a single automated scheme, enabling methodical comparisons between backbones and identification of the most efficient ones.

The authors tackled the problem of training diverse neural network architectures on ImageNet by introducing USI, a unified scheme based on knowledge distillation and modern tricks, which outperformed previous state-of-the-art results across all tested models without requiring adjustments or hyper-parameter tuning.

ImageNet serves as the primary dataset for evaluating the quality of computer-vision models. The common practice today is training each architecture with a tailor-made scheme, designed and tuned by an expert. In this paper, we present a unified scheme for training any backbone on ImageNet. The scheme, named USI (Unified Scheme for ImageNet), is based on knowledge distillation and modern tricks. It requires no adjustments or hyper-parameter tuning between different models, and is efficient in terms of training time. We test USI on a wide variety of architectures, including CNNs, Transformers, mobile-oriented, and MLP-only models. On all models tested, USI outperforms previous state-of-the-art results. Hence, we are able to transform training on ImageNet from an expert-oriented task to an automatic, seamless routine. Since USI accepts any backbone and trains it to top results, it also enables methodical comparisons and the identification of the most efficient backbones along the speed-accuracy Pareto curve. Implementation is available at: https://github.com/Alibaba-MIIL/Solving_ImageNet
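
The abstract states that USI is based on knowledge distillation but does not spell out the loss. As a rough illustration only, the sketch below shows a generic distillation objective: a cross-entropy term on the ground-truth label combined with a KL-divergence term that pulls the student's predictions toward the teacher's. The function names and the `alpha` and `temperature` parameters are assumptions made for this example; they are not taken from the paper or its repository, and the actual USI loss and weighting may differ.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_sum_exp for s in scaled]


def cross_entropy(student_logits, label):
    """Standard cross-entropy of the student against a hard label."""
    return -log_softmax(student_logits)[label]


def kd_loss(student_logits, teacher_logits, temperature=1.0):
    """KL(teacher || student) on temperature-softened distributions.

    The temperature**2 factor is the usual gradient-scale correction
    from generic distillation; it is an assumption here, not a USI detail.
    """
    log_pt = log_softmax(teacher_logits, temperature)
    log_ps = log_softmax(student_logits, temperature)
    kl = sum(math.exp(lt) * (lt - ls) for lt, ls in zip(log_pt, log_ps))
    return kl * temperature ** 2


def total_loss(student_logits, teacher_logits, label,
               alpha=0.5, temperature=1.0):
    """Hypothetical weighting: (1 - alpha) * CE + alpha * KD."""
    ce = cross_entropy(student_logits, label)
    kd = kd_loss(student_logits, teacher_logits, temperature)
    return (1.0 - alpha) * ce + alpha * kd


if __name__ == "__main__":
    student = [2.0, 0.5, -1.0]
    teacher = [3.0, 0.2, -2.0]
    print(total_loss(student, teacher, label=0))
```

In a real training loop the teacher logits would come from a frozen, pretrained network evaluated on the same augmented batch as the student, and the loss would be computed on tensors rather than Python lists.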
