CV · Jun 10, 2022

Masked Autoencoders are Robust Data Augmentors

arXiv:2206.04846v213.230 citations · h-index: 67 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses the need for more effective regularization techniques to prevent over-fitting in vision tasks, though it is an incremental advance because it builds on existing masked image modeling methods.

The paper tackles the problem of insufficiently challenging data augmentations in deep neural networks by proposing Mask-Reconstruct Augmentation (MRA), which uses a masked autoencoder to generate distorted views of input images, resulting in consistent performance improvements on supervised, semi-supervised, and few-shot classification benchmarks.

Deep neural networks are capable of learning powerful representations to tackle complex vision tasks but expose undesirable properties like the over-fitting issue. To this end, regularization techniques like image augmentation are necessary for deep neural networks to generalize well. Nevertheless, most prevalent image augmentation recipes confine themselves to off-the-shelf linear transformations like scale, flip, and colorjitter. Due to their hand-crafted property, these augmentations are insufficient to generate truly hard augmented examples. In this paper, we propose a novel perspective of augmentation to regularize the training process. Inspired by the recent success of applying masked image modeling to self-supervised learning, we adopt the self-supervised masked autoencoder to generate the distorted view of the input images. We show that utilizing such model-based nonlinear transformation as data augmentation can improve high-level recognition tasks. We term the proposed method as Mask-Reconstruct Augmentation (MRA). The extensive experiments on various image classification benchmarks verify the effectiveness of the proposed augmentation. Specifically, MRA consistently enhances the performance on supervised, semi-supervised as well as few-shot classification.
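The mask-then-reconstruct idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the random masking strategy, the patch size, and the mask ratio are all assumptions chosen for illustration, and `placeholder_reconstruct` is a hypothetical stand-in that fills masked patches with the mean of the visible pixels. In the actual method that role would be played by a pretrained masked autoencoder. The sketch also assumes the image height and width are divisible by the patch size.

```python
import numpy as np


def random_patch_mask(h, w, patch, mask_ratio, rng):
    """Return a boolean (h // patch, w // patch) grid; True marks a masked patch."""
    gh, gw = h // patch, w // patch
    n = gh * gw
    n_masked = int(round(mask_ratio * n))
    flat = np.zeros(n, dtype=bool)
    flat[rng.choice(n, size=n_masked, replace=False)] = True
    return flat.reshape(gh, gw)


def placeholder_reconstruct(image, mask, patch):
    """Hypothetical stand-in for a pretrained masked autoencoder.

    Fills every masked patch with the per-channel mean of the visible pixels.
    A real reconstructor would predict plausible content instead.
    """
    # Expand the patch-level mask to a pixel-level (H, W) mask.
    pixel_mask = np.repeat(np.repeat(mask, patch, axis=0), patch, axis=1)
    out = image.copy()
    visible_mean = image[~pixel_mask].mean(axis=0)  # shape (C,)
    out[pixel_mask] = visible_mean
    return out


def mask_reconstruct_augment(image, reconstruct_fn, patch=16, mask_ratio=0.4, rng=None):
    """Mask random patches of an (H, W, C) image, then reconstruct them.

    Returns the distorted view and the patch-level mask that was applied.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape[:2]
    mask = random_patch_mask(h, w, patch, mask_ratio, rng)
    return reconstruct_fn(image, mask, patch), mask


# Usage: produce one augmented view of a random 64x64 RGB image.
rng = np.random.default_rng(0)
image = rng.random((64, 64, 3)).astype(np.float32)
augmented, mask = mask_reconstruct_augment(
    image, placeholder_reconstruct, patch=16, mask_ratio=0.5, rng=rng
)
```

The augmented view would then be fed to the classifier in place of, or alongside, the original image. Passing the reconstructor as an argument keeps the sketch independent of any particular autoencoder.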

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
