LG, DIS-NN, STAT-MECH
May 20, 2020

Beyond the storage capacity: data driven satisfiability transition

arXiv:2005.09992v1
18 citations
Originality: Incremental advance
AI Analysis

This work examines how data structure affects the expressivity of neural networks and the bounds on their generalization error, offering incremental insights beyond the traditional storage capacity framework.

The study computes the Vapnik-Chervonenkis entropy of kernel machines operating on grouped data. It finds that the entropy is non-monotonic in the size of the training set and has an additional critical point beyond the storage capacity; the same behavior also occurs in margin classifiers with randomly labelled data.

Data structure has a dramatic impact on the properties of neural networks, yet its significance in the established theoretical frameworks is poorly understood. Here we compute the Vapnik-Chervonenkis entropy of a kernel machine operating on data grouped into equally labelled subsets. At variance with the unstructured scenario, entropy is non-monotonic in the size of the training set, and displays an additional critical point besides the storage capacity. Remarkably, the same behavior occurs in margin classifiers even with randomly labelled data, as is elucidated by identifying the synaptic volume encoding the transition. These findings reveal aspects of expressivity lying beyond the condensed description provided by the storage capacity, and they indicate the path towards more realistic bounds for the generalization error of neural networks.
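
For orientation, the unstructured baseline that the abstract contrasts against can be sketched with Cover's function counting theorem, which counts the dichotomies of p points in general position in n dimensions that a linear classifier through the origin can realize. This is only an illustration of the classical storage-capacity picture, not the grouped-data computation of the paper; the function names `cover_count` and `log2_dichotomies` are hypothetical.

```python
from math import comb, log2


def cover_count(p: int, n: int) -> int:
    """Cover's count of linearly realizable dichotomies of p points
    in general position in n dimensions (hyperplanes through the origin)."""
    return 2 * sum(comb(p - 1, k) for k in range(n))


def log2_dichotomies(p: int, n: int) -> float:
    """Base-2 logarithm of Cover's count, used here as an
    illustrative stand-in for the unstructured entropy."""
    return log2(cover_count(p, n))


if __name__ == "__main__":
    n = 20
    for p in (10, 20, 40, 60, 80):
        # Fraction of all 2**p labelings that are linearly realizable.
        fraction = cover_count(p, n) / 2**p
        print(f"p/n = {p / n:.1f}  log2 C = {log2_dichotomies(p, n):.2f}  "
              f"fraction = {fraction:.4f}")
```

In this unstructured case the count never decreases as p grows, and the realizable fraction equals exactly one half at p = 2n, the classical storage capacity. The abstract's claim is that grouping the data into equally labelled subsets breaks this monotonic behavior.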
