CV · LG · Nov 19, 2018

Compressing Recurrent Neural Networks with Tensor Ring for Action Recognition

arXiv:1811.07503v1 · 114 citations
Originality: Incremental advance
AI Analysis

This paper addresses scalability issues in training RNNs for action recognition, but the advance is incremental because it builds on existing tensor decomposition methods.

The paper tackles the high memory and computational cost of RNNs on high-dimensional input such as video. It proposes TR-LSTM, a compact LSTM model based on tensor ring decomposition, and reports promising performance on action recognition datasets compared with tensor train LSTM and other state-of-the-art methods.

Recurrent Neural Networks (RNNs) and their variants, such as Long Short-Term Memory (LSTM) networks and Gated Recurrent Unit (GRU) networks, have achieved promising performance in sequential data modeling. The hidden layers in RNNs can be regarded as memory units that store information from the sequential context. However, when dealing with high-dimensional input data, such as video and text, the input-to-hidden linear transformation in RNNs incurs high memory usage and a huge computational cost, which makes RNNs difficult to train at scale. To address this challenge, we propose a novel compact LSTM model, named TR-LSTM, which uses the low-rank tensor ring decomposition (TRD) to reformulate the input-to-hidden transformation. Compared with other tensor decomposition methods, TR-LSTM is more stable. In addition, TR-LSTM can be trained end to end and provides a fundamental building block for RNNs that handle large input data. Experiments on real-world action recognition datasets demonstrate the promising performance of the proposed TR-LSTM compared with the tensor train LSTM and other state-of-the-art competitors.
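To make the idea of reformulating the input-to-hidden transformation concrete, here is a minimal NumPy sketch of a tensor-ring-factorized weight matrix. It is an illustration under stated assumptions, not the paper's implementation: the mode sizes `(8, 8, 8)` and `(4, 4, 4)`, the uniform rank `3`, and the helper name `tr_to_full` are all hypothetical, and the paper's exact core layout may differ. Each core has shape `(rank, mode, rank)`, and an entry of the full tensor is the trace of the product of the matching core slices, which is what closes the "ring".

```python
import numpy as np

def tr_to_full(cores):
    """Contract tensor-ring cores, each of shape (r, d_k, r), into the full tensor.

    Hypothetical helper for illustration only.
    """
    result = cores[0]  # shape (r, d_0, r)
    for core in cores[1:]:
        # Contract the trailing rank index with the next core's leading rank index.
        result = np.tensordot(result, core, axes=([result.ndim - 1], [0]))
    # result has shape (r, d_0, ..., d_{n-1}, r); the trace over the two
    # rank indices closes the ring.
    return np.trace(result, axis1=0, axis2=result.ndim - 1)

rng = np.random.default_rng(0)

# Assumed factorization: input size 8*8*8 = 512, output size 4*4*4 = 64.
in_modes, out_modes, rank = (8, 8, 8), (4, 4, 4), 3
modes = in_modes + out_modes
cores = [rng.standard_normal((rank, d, rank)) * 0.1 for d in modes]

n_in, n_out = int(np.prod(in_modes)), int(np.prod(out_modes))

# Materialize the dense weight only to show equivalence with a plain linear map.
W = tr_to_full(cores).reshape(n_in, n_out)

x = rng.standard_normal((2, n_in))  # batch of 2 flattened inputs
y = x @ W                           # input-to-hidden projection

tr_params = sum(c.size for c in cores)  # parameters stored by the cores
dense_params = n_in * n_out             # parameters of the dense matrix
print(tr_params, dense_params)
```

With these assumed sizes the cores hold 324 parameters in place of the 32,768 of the dense matrix. A practical layer would likely contract the input with the cores directly instead of building `W`, so the dense matrix never has to be stored.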

Code Implementations: 1 repo
