LG | AI | Apr 26, 2023

Tensor Decomposition for Model Reduction in Neural Networks: A Review

arXiv:2304.13539v1 | 33 citations | h-index: 72
Originality: Synthesis-oriented
AI Analysis

The paper addresses the high computational cost of deploying neural networks on edge devices. Because it reviews existing methods rather than proposing a new one, its contribution is incremental.

This paper reviews six tensor decomposition methods for compressing over-parameterized neural networks like CNNs, RNNs, and Transformers, showing they can reduce model size, runtime, and energy consumption while sometimes increasing accuracy.

Modern neural networks have revolutionized the fields of computer vision (CV) and Natural Language Processing (NLP). They are widely used for solving complex CV tasks and NLP tasks such as image classification, image generation, and machine translation. Most state-of-the-art neural networks are over-parameterized and require a high computational cost. One straightforward solution is to replace the layers of the networks with their low-rank tensor approximations using different tensor decomposition methods. This paper reviews six tensor decomposition methods and illustrates their ability to compress model parameters of convolutional neural networks (CNNs), recurrent neural networks (RNNs) and Transformers. The accuracy of some compressed models can be higher than the original versions. Evaluations indicate that tensor decompositions can achieve significant reductions in model size, run-time and energy consumption, and are well suited for implementing neural networks on edge devices.
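To make the idea of replacing a layer with a low-rank approximation concrete, here is a minimal sketch for the simplest case: a single dense layer factorized with a truncated SVD. This is an illustrative stand-in and is not claimed to be one of the six methods the paper reviews; the layer sizes, the rank, the synthetic weights, and the helper name `low_rank_factorize` are all assumptions made for the example.

```python
import numpy as np

def low_rank_factorize(W, rank):
    """Split W (out x in) into A (out x rank) and B (rank x in) via truncated SVD.

    Hypothetical helper for illustration only.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]   # absorb the singular values into the left factor
    B = Vt[:rank, :]
    return A, B

rng = np.random.default_rng(0)
out_dim, in_dim, rank = 256, 512, 16   # assumed layer sizes and target rank

# Synthetic weights: close to rank 16, plus small noise (an assumption;
# real trained layers may be less compressible).
W = rng.standard_normal((out_dim, rank)) @ rng.standard_normal((rank, in_dim))
W += 0.01 * rng.standard_normal((out_dim, in_dim))

A, B = low_rank_factorize(W, rank)

# The single layer y = W x becomes two smaller layers y ~ A (B x).
x = rng.standard_normal(in_dim)
y_full = W @ x
y_low = A @ (B @ x)

params_full = W.size              # out_dim * in_dim
params_low = A.size + B.size      # rank * (out_dim + in_dim)
compression_ratio = params_full / params_low
rel_error = np.linalg.norm(y_full - y_low) / np.linalg.norm(y_full)

print(f"parameters: {params_full} -> {params_low} ({compression_ratio:.1f}x smaller)")
print(f"relative output error: {rel_error:.4f}")
```

The parameter count drops from `out_dim * in_dim` to `rank * (out_dim + in_dim)`, so the saving grows as the chosen rank shrinks, at the cost of a larger approximation error. Tensor decompositions apply the same trade-off to higher-order weight tensors such as convolution kernels.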

