OCLGMLApr 24, 2020

Nonconvex regularization for sparse neural networks

arXiv:2004.11515v28 citations
AI Analysis

This addresses the issue of constructing sparse neural networks with theoretical guarantees for researchers and practitioners in machine learning, though it appears incremental as it builds on prior convex regularization methods.

The paper tackles the problem of over-parametrization and loss of sparsity in neural networks using convex ℓ₁ regularization by proposing a nonconvex regularization method for shallow ReLU networks, proving that it ensures finite networks even with infinite data while maintaining approximation guarantees and existing bounds on network size.

Convex $\ell_1$ regularization using an infinite dictionary of neurons has been suggested for constructing neural networks with desired approximation guarantees, but can be affected by an arbitrary amount of over-parametrization. This can lead to a loss of sparsity and result in networks with too many active neurons for the given data, in particular if the number of data samples is large. As a remedy, in this paper, a nonconvex regularization method is investigated in the context of shallow ReLU networks: We prove that in contrast to the convex approach, any resulting (locally optimal) network is finite even in the presence of infinite data (i.e., if the data distribution is known and the limiting case of infinite samples is considered). Moreover, we show that approximation guarantees and existing bounds on the network size for finite data are maintained.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes