LG · Feb 28, 2018

Tensor Decomposition for Compressing Recurrent Neural Network

Andros Tjandra, Sakriani Sakti, Satoshi Nakamura

arXiv:1802.10410v210.156 citations · Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses the issue of high parameter requirements in RNNs for researchers and practitioners in machine learning, but it is incremental as it applies existing tensor decomposition techniques to a specific RNN variant.

The paper tackled the problem of reducing parameters in Recurrent Neural Networks (RNNs) while maintaining performance by applying tensor decomposition methods like CP, Tucker, and Tensor Train to Gated Recurrent Units (GRUs). It found that TT-GRU achieved the best results across different parameter counts in sequence modeling tasks.

In machine learning, the Recurrent Neural Network (RNN) has become a popular architecture for sequential data modeling. However, behind their impressive performance, RNNs require a large number of parameters for both training and inference. In this paper, we aim to reduce the number of parameters while maintaining the expressive power of the RNN. We utilize several tensor decomposition methods, including CANDECOMP/PARAFAC (CP), Tucker decomposition, and Tensor Train (TT), to re-parameterize the Gated Recurrent Unit (GRU) RNN. We evaluate the performance of all tensor-based RNNs on sequence modeling tasks across a range of parameter counts. Based on our experimental results, TT-GRU achieved the best results across the different parameter counts compared to the other decomposition methods.
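To make the re-parameterization idea concrete, the following is a minimal NumPy sketch of how a single weight matrix (for example, one gate's input-to-hidden matrix in a GRU) could be stored in Tensor Train format. The mode sizes, TT-ranks, and the helper names tt_param_count and tt_to_dense are hypothetical choices made for illustration; they are not taken from the paper or from its released code.

```python
import numpy as np

def tt_param_count(in_modes, out_modes, ranks):
    """Number of parameters stored by TT cores of shape (r_k, m_k, n_k, r_{k+1})."""
    return sum(
        ranks[k] * in_modes[k] * out_modes[k] * ranks[k + 1]
        for k in range(len(in_modes))
    )

def tt_to_dense(cores):
    """Contract TT cores into the dense (prod(m_k), prod(n_k)) matrix they represent."""
    result = cores[0]  # shape (1, m_0, n_0, r_1)
    for core in cores[1:]:
        # Contract the trailing rank axis with the next core's leading rank axis.
        result = np.tensordot(result, core, axes=([-1], [0]))
    # Drop the boundary rank axes (both of size 1): shape (m_0, n_0, m_1, n_1, ...).
    result = result.reshape(result.shape[1:-1])
    d = len(cores)
    # Group all input modes first, then all output modes.
    perm = list(range(0, 2 * d, 2)) + list(range(1, 2 * d, 2))
    result = result.transpose(perm)
    in_dim = int(np.prod(result.shape[:d]))
    out_dim = int(np.prod(result.shape[d:]))
    return result.reshape(in_dim, out_dim)

# Hypothetical sizes: a 256 x 512 matrix factored as (4*8*8) x (8*8*8).
in_modes, out_modes, ranks = (4, 8, 8), (8, 8, 8), (1, 3, 3, 1)

rng = np.random.default_rng(0)
cores = [
    rng.standard_normal((ranks[k], in_modes[k], out_modes[k], ranks[k + 1]))
    for k in range(len(in_modes))
]

W = tt_to_dense(cores)                      # dense equivalent, only for checking
dense_params = W.shape[0] * W.shape[1]      # parameters of an unfactored matrix
tt_params = tt_param_count(in_modes, out_modes, ranks)
print(W.shape, dense_params, tt_params)
```

In this sketch the dense matrix is rebuilt only to check the shapes. A practical layer would multiply the input by the cores directly, so that the full matrix is never materialized, and would treat the TT-ranks as the knob that trades parameter count against expressive power.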
