LG · CV · Dec 11, 2017

StrassenNets: Deep Learning with a Multiplication Budget

arXiv:1712.03942v3 · 30 citations
Originality: Highly original
AI Analysis

This work addresses the high computational cost of evaluating DNNs, which affects practitioners in fields such as image classification and language modeling. It reduces the number of multiplications while maintaining predictive performance.

The paper reduces the computational cost of matrix multiplications in deep neural networks by learning low-cost approximations under a budget on the number of multiplication operations. It reports a reduction of over 99.5% in multiplications while maintaining the predictive performance of the full-precision models on ImageNet image classification and LSTM language modeling.

A large fraction of the arithmetic operations required to evaluate deep neural networks (DNNs) consists of matrix multiplications, in both convolution and fully connected layers. We perform end-to-end learning of low-cost approximations of matrix multiplications in DNN layers by casting matrix multiplications as 2-layer sum-product networks (SPNs) (arithmetic circuits) and learning their (ternary) edge weights from data. The SPNs disentangle multiplication and addition operations and enable us to impose a budget on the number of multiplication operations. Combining our method with knowledge distillation and applying it to image classification DNNs (trained on ImageNet) and language modeling DNNs (using LSTMs), we obtain a first-of-a-kind reduction in number of multiplications (over 99.5%) while maintaining the predictive performance of the full-precision models. Finally, we demonstrate that the proposed framework is able to rediscover Strassen's matrix multiplication algorithm, learning to multiply $2 \times 2$ matrices using only 7 multiplications instead of 8.
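The 2-layer structure described in the abstract can be illustrated with the classical Strassen algorithm written as three ternary matrices. The sketch below is not the paper's code and does not learn anything: the weights are filled in by hand from the textbook Strassen formulas, and the names `W_A`, `W_B`, `W_C`, `matvec`, and `spn_matmul_2x2` are illustrative assumptions. It only shows how additions (the ternary layers) are separated from the 7 elementwise multiplications in the middle.

```python
# Hand-written ternary weights (entries in {-1, 0, 1}) encoding the textbook
# Strassen algorithm. In the paper these weights are learned from data; here
# they are fixed for illustration only.
# Input vectors are row-major: [x11, x12, x21, x22].
W_A = [
    [ 1, 0, 0,  1],   # a11 + a22
    [ 0, 0, 1,  1],   # a21 + a22
    [ 1, 0, 0,  0],   # a11
    [ 0, 0, 0,  1],   # a22
    [ 1, 1, 0,  0],   # a11 + a12
    [-1, 0, 1,  0],   # a21 - a11
    [ 0, 1, 0, -1],   # a12 - a22
]
W_B = [
    [ 1, 0, 0,  1],   # b11 + b22
    [ 1, 0, 0,  0],   # b11
    [ 0, 1, 0, -1],   # b12 - b22
    [-1, 0, 1,  0],   # b21 - b11
    [ 0, 0, 0,  1],   # b22
    [ 1, 1, 0,  0],   # b11 + b12
    [ 0, 0, 1,  1],   # b21 + b22
]
W_C = [
    [1,  0, 0, 1, -1, 0, 1],   # c11 = M1 + M4 - M5 + M7
    [0,  0, 1, 0,  1, 0, 0],   # c12 = M3 + M5
    [0,  1, 0, 1,  0, 0, 0],   # c21 = M2 + M4
    [1, -1, 1, 0,  0, 1, 0],   # c22 = M1 - M2 + M3 + M6
]


def matvec(W, v):
    """Apply a ternary matrix; conceptually only additions and sign flips."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def spn_matmul_2x2(A, B):
    """Multiply two 2x2 matrices using 7 elementwise products."""
    a = [A[0][0], A[0][1], A[1][0], A[1][1]]
    b = [B[0][0], B[0][1], B[1][0], B[1][1]]
    left = matvec(W_A, a)    # first layer, A side (additions only)
    right = matvec(W_B, b)   # first layer, B side (additions only)
    products = [l * r for l, r in zip(left, right)]  # the 7 multiplications
    c = matvec(W_C, products)  # second layer (additions only)
    return [[c[0], c[1]], [c[2], c[3]]]
```

The number of rows in `W_A` and `W_B` (here 7) plays the role of the multiplication budget: a naive 2×2 product would need 8 such rows.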
