LG | ML | Aug 20, 2017

Improving Deep Learning using Generic Data Augmentation

arXiv:1708.06020v1 | 394 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses the problem of expensive data collection for deep learning by providing guidance on which data augmentation methods are effective. The contribution is incremental: it benchmarks existing schemes rather than introducing new ones.

The study benchmarks popular data augmentation schemes to determine their impact on Convolutional Neural Network (CNN) performance. It finds that cropping, a geometric augmentation, significantly increases accuracy, with results reported as Top-1 and Top-5 accuracy.
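For readers unfamiliar with the two metrics, the following is a minimal sketch of how Top-1 and Top-5 accuracy are usually computed from per-class scores. The function name `top_k_accuracy` and the toy scores are illustrative assumptions, not taken from the paper.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by score, highest first; keep the first k.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)

# Toy scores for 4 samples over 6 classes (made-up numbers, not from the paper).
scores = [
    [0.10, 0.50, 0.20, 0.05, 0.10, 0.05],  # highest score: class 1
    [0.30, 0.10, 0.25, 0.20, 0.10, 0.05],  # highest score: class 0
    [0.05, 0.05, 0.10, 0.20, 0.50, 0.10],  # highest score: class 4
    [0.40, 0.30, 0.15, 0.10, 0.04, 0.01],  # highest score: class 0
]
labels = [1, 2, 4, 5]

top1 = top_k_accuracy(scores, labels, 1)  # 2 of 4 samples correct
top5 = top_k_accuracy(scores, labels, 5)  # 3 of 4 samples correct
print(top1, top5)
```

Top-5 is never lower than Top-1 on the same predictions, because a Top-1 hit is always also a Top-5 hit.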

Deep artificial neural networks require a large corpus of training data in order to learn effectively, and collecting such data is often expensive and laborious. Data augmentation overcomes this issue by artificially inflating the training set with label-preserving transformations. Recently there has been extensive use of generic data augmentation to improve Convolutional Neural Network (CNN) task performance. This study benchmarks various popular data augmentation schemes to allow researchers to make informed decisions as to which training methods are most appropriate for their data sets. Various geometric and photometric schemes are evaluated on a coarse-grained data set using a relatively simple CNN. Experimental results, run using 4-fold cross-validation and reported in terms of Top-1 and Top-5 accuracy, indicate that cropping in geometric augmentation significantly increases CNN task performance.
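To make the idea of inflating a training set with label-preserving transformations concrete, here is a small sketch using only the Python standard library. It shows two geometric transformations (random crop, horizontal flip) and one photometric transformation (brightness shift). The function names, the crop size, the brightness range, and the toy image are all assumptions chosen for illustration; they are not the paper's exact schemes or settings.

```python
import random

def random_crop(image, crop_h, crop_w, rng):
    """Geometric: cut a random crop_h x crop_w window out of the image."""
    h, w = len(image), len(image[0])
    top = rng.randint(0, h - crop_h)    # randint is inclusive on both ends
    left = rng.randint(0, w - crop_w)
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]

def horizontal_flip(image):
    """Geometric: mirror the image left to right."""
    return [row[::-1] for row in image]

def shift_brightness(image, delta):
    """Photometric: add delta to every pixel, clamped to the 0..255 range."""
    return [[min(255, max(0, p + delta)) for p in row] for row in image]

def augment(dataset, crop_h, crop_w, seed=0):
    """Return each original (image, label) pair plus three transformed copies.

    Every copy keeps the original label, which is what makes the
    transformations label-preserving.
    """
    rng = random.Random(seed)
    out = []
    for image, label in dataset:
        out.append((image, label))
        out.append((random_crop(image, crop_h, crop_w, rng), label))
        out.append((horizontal_flip(image), label))
        out.append((shift_brightness(image, rng.randint(-30, 30)), label))
    return out

# A 4x4 grayscale toy image (made-up values) with a hypothetical label.
image = [
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 100, 110, 120],
    [130, 140, 150, 160],
]
augmented = augment([(image, "cat")], crop_h=3, crop_w=3)
print(len(augmented))  # one original plus three augmented copies
```

In this sketch the cropped copy is smaller than the original. A real pipeline would typically resize crops back to a fixed input size, and would use an array library rather than nested lists.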

