LG, CV, ML
Feb 20, 2020

Affinity and Diversity: Quantifying Mechanisms of Data Augmentation

arXiv:2002.08973v2
86 citations
AI Analysis

This work addresses a fundamental gap in the understanding of data augmentation, relevant to both researchers and practitioners, though it is incremental in that it builds on existing heuristic approaches.

The paper tackles the problem of understanding why data augmentation improves model generalization. It introduces two interpretable measures, Affinity and Diversity, and finds that augmentation performance is predicted not by either measure alone but by jointly optimizing the two.

Though data augmentation has become a standard component of deep neural network training, the underlying mechanism behind the effectiveness of these techniques remains poorly understood. In practice, augmentation policies are often chosen using heuristics of either distribution shift or augmentation diversity. Inspired by these, we seek to quantify how data augmentation improves model generalization. To this end, we introduce interpretable and easy-to-compute measures: Affinity and Diversity. We find that augmentation performance is predicted not by either of these alone but by jointly optimizing the two.
