MLLGOct 6, 2021

VC dimension of partially quantized neural networks in the overparametrized regime

arXiv:2110.02456v11 citations
Originality Incremental advance
AI Analysis

This addresses the theoretical gap in VC theory for overparametrized networks, with potential implications for understanding generalization in deep learning, though it is incremental as it focuses on a specific class of quantized networks.

The paper tackles the problem of explaining the small generalization error of overparametrized neural networks by analyzing the VC dimension of partially quantized networks (HANNs), showing they can have significantly smaller VC dimension than the number of weights while achieving the minimax rate for classification and matching state-of-the-art performance on 121 UCI datasets.

Vapnik-Chervonenkis (VC) theory has so far been unable to explain the small generalization error of overparametrized neural networks. Indeed, existing applications of VC theory to large networks obtain upper bounds on VC dimension that are proportional to the number of weights, and for a large class of networks, these upper bound are known to be tight. In this work, we focus on a class of partially quantized networks that we refer to as hyperplane arrangement neural networks (HANNs). Using a sample compression analysis, we show that HANNs can have VC dimension significantly smaller than the number of weights, while being highly expressive. In particular, empirical risk minimization over HANNs in the overparametrized regime achieves the minimax rate for classification with Lipschitz posterior class probability. We further demonstrate the expressivity of HANNs empirically. On a panel of 121 UCI datasets, overparametrized HANNs match the performance of state-of-the-art full-precision models.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes