CV · IV · May 31, 2022

CropMix: Sampling a Rich Input Distribution via Multi-Scale Cropping

Oxford
arXiv:2205.15955v1 · 10 citations · h-index: 93 · Has Code
Originality: Incremental advance
AI Analysis

This incremental method addresses the need for better training data distributions in vision tasks, benefiting researchers and practitioners in computer vision.

The authors address a limitation of single random cropping in image classification: one crop may capture only limited or irrelevant information. They propose CropMix, which crops an image at multiple scales and mixes the resulting views to form a richer input distribution, improving performance on benchmark tasks without sacrificing efficiency.

We present a simple method, CropMix, for producing a rich input distribution from the original dataset distribution. Unlike single random cropping, which may inadvertently capture only limited or irrelevant information (pure background, unrelated objects, etc.), we crop an image multiple times using distinct crop scales, thereby ensuring that multi-scale information is captured. The new input distribution, which serves as training data for a number of vision tasks, is then formed by simply mixing the multiple cropped views. We first demonstrate that CropMix can be seamlessly applied to virtually any training recipe and neural network architecture performing classification tasks. CropMix is shown to improve the performance of image classifiers on several benchmark tasks across the board without sacrificing computational simplicity and efficiency. Moreover, we show that CropMix benefits both contrastive learning and masked image modeling, yielding more powerful representations that achieve better results when transferred to downstream tasks. Code is available at GitHub.
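The core idea in the abstract (crop the same image several times at distinct scales, then mix the views) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `random_crop_at_scale` and `cropmix`, the equal-width scale ranges, the square nearest-neighbour crops, and the Dirichlet-weighted pixel averaging are all assumptions chosen to keep the example short and dependency-free. The abstract only states that the views are "mixed" and does not specify the mixing operation.

```python
import numpy as np


def random_crop_at_scale(image, scale_range, out_size, rng):
    """Crop a random square covering a sampled fraction of the image area, then resize."""
    h, w = image.shape[:2]
    # Sample the fraction of the image area the crop should cover.
    area_frac = rng.uniform(*scale_range)
    side = int(round(np.sqrt(area_frac * h * w)))
    side = max(1, min(side, h, w))  # keep the crop inside the image
    top = rng.integers(0, h - side + 1)
    left = rng.integers(0, w - side + 1)
    crop = image[top:top + side, left:left + side]
    # Nearest-neighbour resize to out_size x out_size (avoids extra dependencies).
    idx = np.arange(out_size) * side // out_size
    return crop[idx][:, idx]


def cropmix(image, num_crops=2, out_size=32, rng=None):
    """Hypothetical sketch: mix several crops taken at distinct scales."""
    rng = np.random.default_rng() if rng is None else rng
    # Assumption: split the area-fraction interval into num_crops
    # non-overlapping ranges so every view uses a distinct scale.
    edges = np.linspace(0.01, 1.0, num_crops + 1)
    views = [
        random_crop_at_scale(image, (edges[i], edges[i + 1]), out_size, rng)
        .astype(np.float32)
        for i in range(num_crops)
    ]
    # Assumption: mix by a convex combination with Dirichlet-sampled weights.
    weights = rng.dirichlet(np.ones(num_crops))
    mixed = sum(w * v for w, v in zip(weights, views))
    return np.asarray(mixed, dtype=np.float32)


if __name__ == "__main__":
    demo_rng = np.random.default_rng(0)
    demo_image = demo_rng.integers(0, 256, size=(64, 48, 3)).astype(np.uint8)
    mixed_view = cropmix(demo_image, num_crops=3, out_size=32, rng=demo_rng)
    print(mixed_view.shape, mixed_view.dtype)
```

In a training pipeline, a function like this would replace the single random crop in the data-augmentation step, with the label left unchanged because every view comes from the same image. Other mixing operations, such as pasting regions of one view into another, would fit the same structure.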

Code Implementations: 1 repo