MLLGFeb 21, 2018

Detecting Learning vs Memorization in Deep Neural Networks using Shared Structure Validation Sets

arXiv:1802.07714v11 citations
Originality Incremental advance
AI Analysis

This addresses a fundamental issue in deep learning interpretability for researchers, though it appears incremental as it builds on prior work on shuffled labels.

The paper tackled the problem of distinguishing learning from memorization in deep neural networks by proposing a permutation approach that evaluates predictive performance on validation sets sharing structure with training data, and found that DNNs can still learn even with Gaussian noise inputs.

The roles played by learning and memorization represent an important topic in deep learning research. Recent work on this subject has shown that the optimization behavior of DNNs trained on shuffled labels is qualitatively different from DNNs trained with real labels. Here, we propose a novel permutation approach that can differentiate memorization from learning in deep neural networks (DNNs) trained as usual (i.e., using the real labels to guide the learning, rather than shuffled labels). The evaluation of weather the DNN has learned and/or memorized, happens in a separate step where we compare the predictive performance of a shallow classifier trained with the features learned by the DNN, against multiple instances of the same classifier, trained on the same input, but using shuffled labels as outputs. By evaluating these shallow classifiers in validation sets that share structure with the training set, we are able to tell apart learning from memorization. Application of our permutation approach to multi-layer perceptrons and convolutional neural networks trained on image data corroborated many findings from other groups. Most importantly, our illustrations also uncovered interesting dynamic patterns about how DNNs memorize over increasing numbers of training epochs, and support the surprising result that DNNs are still able to learn, rather than only memorize, when trained with pure Gaussian noise as input.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes