LG | AI | Oct 21, 2021

Lottery Tickets with Nonzero Biases

arXiv:2110.11150v2 | 8 citations
Originality: Incremental advance
AI Analysis

This work addresses a theoretical limitation of pruning methods for deep neural networks, namely their restriction to zero biases, and offers an incremental advance for researchers and practitioners working on efficient model compression.

The paper closes a gap in the strong lottery ticket hypothesis by extending initialization schemes and existence proofs to networks with nonzero biases. According to the authors, this enables truly orthogonal parameter initialization and reduces potential pruning errors. Experiments on standard benchmark data highlight the practical benefits of nonzero bias initialization, and the paper also presents theoretically inspired extensions of state-of-the-art strong lottery ticket pruning.

The strong lottery ticket hypothesis holds the promise that pruning randomly initialized deep neural networks could offer a computationally efficient alternative to deep learning with stochastic gradient descent. Common parameter initialization schemes and existence proofs, however, are focused on networks with zero biases, thus forgoing the potential universal approximation property of pruning. To fill this gap, we extend multiple initialization schemes and existence proofs to nonzero biases, including explicit 'looks-linear' approaches for ReLU activation functions. These not only enable truly orthogonal parameter initialization but also reduce potential pruning errors. In experiments on standard benchmark data, we further highlight the practical benefits of nonzero bias initialization schemes, and present theoretically inspired extensions for state-of-the-art strong lottery ticket pruning.
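
To make the 'looks-linear' idea mentioned in the abstract concrete, the following is a minimal sketch, not the paper's exact construction. It assumes the commonly used mirrored-block scheme, in which each hidden layer stores a weight block together with its negation so that relu(u) - relu(-u) = u cancels the nonlinearity at initialization. Mirroring the biases with opposite sign is an assumption made here for illustration, and the function names (`looks_linear_params`, `forward`, `affine_reference`), the Gaussian scaling, and the bias scale are all hypothetical choices.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def looks_linear_params(dims, rng, bias_scale=0.1):
    """Build mirrored (W, b) pairs for a ReLU network with layer sizes `dims`.

    Returns the full layer parameters and the underlying base blocks.
    Hidden layers have twice the nominal width because each unit is
    paired with its negated copy.
    """
    base = []
    for d_prev, d_next in zip(dims[:-1], dims[1:]):
        W0 = rng.standard_normal((d_next, d_prev)) / np.sqrt(d_prev)
        b0 = bias_scale * rng.standard_normal(d_next)  # nonzero bias (assumed scale)
        base.append((W0, b0))

    layers = []
    last = len(base) - 1
    for i, (W0, b0) in enumerate(base):
        # Every layer except the first reads a mirrored input [relu(u); relu(-u)].
        W = W0 if i == 0 else np.hstack([W0, -W0])
        b = b0
        # Every layer except the last emits a mirrored pre-activation [u; -u].
        if i != last:
            W = np.vstack([W, -W])
            b = np.concatenate([b0, -b0])
        layers.append((W, b))
    return layers, base


def forward(layers, x):
    """Standard ReLU forward pass with a linear output layer."""
    h = x
    for W, b in layers[:-1]:
        h = relu(W @ h + b)
    W, b = layers[-1]
    return W @ h + b


def affine_reference(base, x):
    """Composition of the base affine maps, with no nonlinearity."""
    u = x
    for W0, b0 in base:
        u = W0 @ u + b0
    return u


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    layers, base = looks_linear_params([5, 8, 8, 3], rng)
    x = rng.standard_normal(5)
    # At initialization the ReLU network should agree with the affine map.
    print(np.allclose(forward(layers, x), affine_reference(base, x)))
```

Under these assumptions the ReLU network computes an affine function of its input at initialization, even though its biases are nonzero. A pruning mask applied to such a network would break the mirrored structure selectively, which is where the nonlinearity would come from.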

