LGAIOct 1, 2021

Learning Compact Representations of Neural Networks using DiscriminAtive Masking (DAM)

arXiv:2110.00684v15 citations
Originality Incremental advance
AI Analysis

This addresses the need for more efficient and stable pruning techniques in deep learning, though it appears incremental as it builds on existing structured pruning methods.

The paper tackled the problem of instability and resource-intensive fine-tuning in structured pruning of neural networks by proposing DiscriminAtive Masking (DAM), a single-stage method that achieved state-of-the-art performance across applications like image classification and graph representation learning.

A central goal in deep learning is to learn compact representations of features at every layer of a neural network, which is useful for both unsupervised representation learning and structured network pruning. While there is a growing body of work in structured pruning, current state-of-the-art methods suffer from two key limitations: (i) instability during training, and (ii) need for an additional step of fine-tuning, which is resource-intensive. At the core of these limitations is the lack of a systematic approach that jointly prunes and refines weights during training in a single stage, and does not require any fine-tuning upon convergence to achieve state-of-the-art performance. We present a novel single-stage structured pruning method termed DiscriminAtive Masking (DAM). The key intuition behind DAM is to discriminatively prefer some of the neurons to be refined during the training process, while gradually masking out other neurons. We show that our proposed DAM approach has remarkably good performance over various applications, including dimensionality reduction, recommendation system, graph representation learning, and structured pruning for image classification. We also theoretically show that the learning objective of DAM is directly related to minimizing the L0 norm of the masking layer.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes