LG, ML, Jul 9, 2020

Untapped Potential of Data Augmentation: A Domain Generalization Viewpoint

arXiv:2007.04662v1, 1 citation
AI Analysis

This work addresses a fundamental issue for machine learning practitioners who use data augmentation, but its contribution is incremental: it builds on existing augmentation techniques and does not propose a new solution.

The paper argues that data augmentation does not yield the robust features it is assumed to, even with state-of-the-art methods. It adopts a Domain Generalization viewpoint to probe overfitting and identify opportunities for improvement.

Data augmentation is a popular pre-processing trick to improve generalization accuracy. It is believed that by processing augmented inputs in tandem with the original ones, the model learns a more robust set of features that are shared between the original and augmented counterparts. However, we show that this is not the case even for the best augmentation technique. In this work, we take a Domain Generalization viewpoint of augmentation-based methods. This new perspective allows us to probe overfitting and delineate avenues for improvement. Our exploration with the state-of-the-art augmentation method provides evidence that the learned representations are less robust than assumed, even towards the distortions used during training. This suggests that augmented examples have untapped potential.
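One way to picture the Domain Generalization viewpoint is to treat each training-time distortion as its own "domain" and compare accuracy on clean inputs with accuracy on each distorted copy. The abstract does not spell out the evaluation protocol, so the following is only a minimal sketch under that assumption. The helper names (`accuracy`, `per_domain_accuracy`, `robustness_gaps`), the toy classifier, and the two toy distortions are all hypothetical and are not taken from the paper.

```python
from typing import Callable, Dict, List

Example = List[float]
Predictor = Callable[[Example], int]
Distortion = Callable[[Example], Example]


def accuracy(predict: Predictor, inputs: List[Example], labels: List[int]) -> float:
    """Fraction of inputs whose predicted label matches the true label."""
    correct = sum(1 for x, y in zip(inputs, labels) if predict(x) == y)
    return correct / len(labels)


def per_domain_accuracy(
    predict: Predictor,
    inputs: List[Example],
    labels: List[int],
    distortions: Dict[str, Distortion],
) -> Dict[str, float]:
    """Accuracy on the clean inputs and on each distorted copy of them.

    Assumption (not from the paper): each distortion defines one domain.
    """
    report = {"clean": accuracy(predict, inputs, labels)}
    for name, distort in distortions.items():
        distorted = [distort(x) for x in inputs]
        report[name] = accuracy(predict, distorted, labels)
    return report


def robustness_gaps(report: Dict[str, float]) -> Dict[str, float]:
    """Clean accuracy minus per-domain accuracy; a larger gap means less robust."""
    clean = report["clean"]
    return {name: clean - acc for name, acc in report.items() if name != "clean"}


# Toy stand-in for a trained model: predicts class 1 when the features sum to > 0.
def toy_predict(x: Example) -> int:
    return 1 if sum(x) > 0 else 0


inputs = [[1.0, 2.0], [-1.0, -2.0], [0.5, 0.5], [-0.5, -0.25]]
labels = [1, 0, 1, 0]

# Hypothetical distortions standing in for the augmentations used during training.
distortions = {
    "negate": lambda x: [-v for v in x],    # flips the sign of every feature
    "scale": lambda x: [2.0 * v for v in x],  # preserves the sign of the sum
}

report = per_domain_accuracy(toy_predict, inputs, labels, distortions)
gaps = robustness_gaps(report)
print(report)
print(gaps)
```

In this toy setup the classifier is unaffected by scaling but fails completely under negation, so the per-domain gaps separate the distortions the model handles from those it does not. A nonzero gap on a distortion that was used during training would be the kind of evidence the abstract describes.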

