CV, LG, NE | Jun 16, 2021

Evolving Image Compositions for Feature Representation Learning

arXiv:2106.09011v2 | 7 citations
Originality: Incremental advance
AI Analysis

This is an incremental improvement for computer vision researchers and practitioners seeking better data augmentation techniques.

The paper addresses data augmentation for visual recognition by proposing PatchMix, a method that creates new training samples by composing patches from pairs of images in grid-like patterns. The authors report improved transfer learning and gains over a base model, such as +1.16 on ImageNet.

Convolutional neural networks for visual recognition require large amounts of training samples and usually benefit from data augmentation. This paper proposes PatchMix, a data augmentation method that creates new samples by composing patches from pairs of images in a grid-like pattern. These new samples are assigned label scores that are proportional to the number of patches borrowed from each image. We then add a set of additional losses at the patch-level to regularize and to encourage good representations at both the patch and image levels. A ResNet-50 model trained on ImageNet using PatchMix exhibits superior transfer learning capabilities across a wide array of benchmarks. Although PatchMix can rely on random pairings and random grid-like patterns for mixing, we explore evolutionary search as a guiding strategy to jointly discover optimal grid-like patterns and image pairings. For this purpose, we conceive a fitness function that bypasses the need to re-train a model to evaluate each possible choice. In this way, PatchMix outperforms a base model on CIFAR-10 (+1.91), CIFAR-100 (+5.31), Tiny Imagenet (+3.52), and ImageNet (+1.16).
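
The mixing and labeling step described in the abstract can be sketched as follows. This is a minimal, dependency-free illustration and not the authors' implementation: it assumes a uniformly random binary grid mask, treats images as 2D lists of pixel values, and omits the patch-level losses and the evolutionary search. The function name `patchmix` and its parameters (`grid`, `num_classes`, `rng`) are hypothetical.

```python
import random


def patchmix(img_a, img_b, label_a, label_b, num_classes, grid=4, rng=None):
    """Mix two equally sized images patch by patch on a grid x grid layout.

    img_a, img_b: 2D lists (rows of pixel values) with identical shape.
    Returns the mixed image, the soft label, and the binary grid mask.
    """
    rng = rng or random.Random()
    height, width = len(img_a), len(img_a[0])
    if height % grid or width % grid:
        raise ValueError("image sides must be divisible by the grid size")
    patch_h, patch_w = height // grid, width // grid

    # Assumption: each grid cell is drawn independently at random.
    # mask[i][j] == 1 means cell (i, j) is borrowed from img_b.
    mask = [[rng.randint(0, 1) for _ in range(grid)] for _ in range(grid)]

    # Copy each pixel from whichever image owns the cell it falls in.
    mixed = [
        [
            img_b[y][x] if mask[y // patch_h][x // patch_w] else img_a[y][x]
            for x in range(width)
        ]
        for y in range(height)
    ]

    # Label scores are proportional to the number of patches from each image.
    frac_b = sum(map(sum, mask)) / (grid * grid)
    soft_label = [0.0] * num_classes
    soft_label[label_a] += 1.0 - frac_b
    soft_label[label_b] += frac_b
    return mixed, soft_label, mask
```

For example, with a 4x4 grid in which 6 cells come from the second image, the soft label assigns 10/16 to the first image's class and 6/16 to the second's. In the paper, the mask and the image pairing can also be chosen by evolutionary search rather than at random.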

