CV · Mar 25, 2020

Circumventing Outliers of AutoAugment with Knowledge Distillation

arXiv:2003.11342v1 · 68 citations
AI Analysis

This work addresses the issue of training instability and noise in data augmentation for image classification, offering an incremental improvement over AutoAugment.

The paper tackles two weaknesses of AutoAugment: its sensitivity to the operator space and hyper-parameters, and its tendency to remove discriminative information from training images, which makes the ground-truth label less reliable and can degrade network optimization. By using knowledge distillation, in which a teacher model's output guides training, the authors report a top-1 accuracy of 85.8% on ImageNet classification, which they claim as a new state of the art.

AutoAugment has been a powerful algorithm that improves the accuracy of many vision tasks, yet it is sensitive to the operator space as well as hyper-parameters, and an improper setting may degenerate network optimization. This paper delves deep into the working mechanism, and reveals that AutoAugment may remove part of discriminative information from the training image and so insisting on the ground-truth label is no longer the best option. To relieve the inaccuracy of supervision, we make use of knowledge distillation that refers to the output of a teacher model to guide network training. Experiments are performed in standard image classification benchmarks, and demonstrate the effectiveness of our approach in suppressing noise of data augmentation and stabilizing training. Upon the cooperation of knowledge distillation and AutoAugment, we claim the new state-of-the-art on ImageNet classification with a top-1 accuracy of 85.8%.
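The idea in the abstract, supervising an augmented image with a teacher's prediction in addition to the ground-truth label, can be sketched as a combined loss. This is a minimal illustration and not the authors' implementation: the function names, the weight `lam`, the `temperature` parameter, and the example logits are all assumptions made for this sketch, and the exact loss form used in the paper may differ.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Hard-label loss: negative log-probability of the ground-truth class."""
    return -math.log(probs[label])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same classes."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def distilled_loss(student_logits, teacher_logits, label, lam=1.0, temperature=1.0):
    """Ground-truth cross-entropy plus a teacher-matching term.

    Both sets of logits are assumed to come from the SAME augmented image,
    so the teacher can signal when augmentation has removed the evidence
    for the ground-truth class. `lam` and `temperature` are hypothetical
    knobs for this sketch.
    """
    hard = cross_entropy(softmax(student_logits), label)
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    soft = kl_divergence(teacher, student)
    return hard + lam * soft


# Hypothetical logits for a 3-class problem after a heavy augmentation.
student_logits = [2.0, 0.5, -1.0]
teacher_logits = [0.3, 0.2, 0.1]  # teacher is unsure: the object may be cropped out
print(distilled_loss(student_logits, teacher_logits, label=0, lam=1.0))
```

When the teacher and student agree exactly, the KL term vanishes and the loss reduces to ordinary cross-entropy; when the teacher is uncertain about a heavily augmented image, the extra term discourages the student from fitting the hard label with high confidence.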

Code Implementations: 1 repo