LG AI SD AS

Mar 11, 2022

Exploiting Low-Rank Tensor-Train Deep Neural Networks Based on Riemannian Gradient Descent With Illustrations of Speech Processing

Jun Qi, Chao-Han Huck Yang, Pin-Yu Chen, Javier Tejedor

NVIDIA

arXiv:2203.06031v111.124 citations

h-index: 22

Has Code

Originality: Incremental advance

AI Analysis

This work addresses model efficiency for speech processing applications, presenting an incremental improvement over existing tensor-train methods.

The paper tackles the problem of high model complexity in tensor-train deep neural networks (TT-DNNs) by designing low-rank variants using Riemannian gradient descent, achieving better performance with fewer parameters in speech enhancement and spoken command recognition tasks.

This work focuses on designing low-complexity hybrid tensor networks by considering trade-offs between model complexity and practical performance. First, we exploit a low-rank tensor-train deep neural network (TT-DNN) to build an end-to-end deep learning pipeline, namely LR-TT-DNN. Second, a hybrid model combining LR-TT-DNN with a convolutional neural network (CNN), denoted as CNN+(LR-TT-DNN), is set up to boost performance. Instead of randomly assigning large TT-ranks for TT-DNN, we leverage Riemannian gradient descent to determine a TT-DNN associated with small TT-ranks. Furthermore, CNN+(LR-TT-DNN) consists of convolutional layers at the bottom for feature extraction and several TT layers at the top to solve regression and classification problems. We separately assess the LR-TT-DNN and CNN+(LR-TT-DNN) models on speech enhancement and spoken command recognition tasks. Our empirical evidence demonstrates that the LR-TT-DNN and CNN+(LR-TT-DNN) models with fewer model parameters can outperform the TT-DNN and CNN+(TT-DNN) counterparts.
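To make the link between small TT-ranks and parameter count concrete, here is a minimal NumPy sketch. It is not the paper's Riemannian gradient descent procedure: it uses a plain truncated TT-SVD, a simpler technique swapped in for illustration, to factor a tensor into tensor-train cores with capped ranks. The function names `tt_svd` and `tt_to_full`, the tensor shape, and the rank value are all assumptions chosen for the example, not taken from the paper or its code.

```python
import numpy as np

def tt_svd(tensor, max_rank):
    """Factor `tensor` into TT cores, capping every TT-rank at `max_rank`.

    Illustrative truncated TT-SVD (hypothetical helper, not the paper's method).
    Each returned core has shape (rank_in, mode_size, rank_out).
    """
    shape = tensor.shape
    cores = []
    rank_prev = 1
    remainder = tensor.reshape(rank_prev * shape[0], -1)
    for k in range(len(shape) - 1):
        u, s, vt = np.linalg.svd(remainder, full_matrices=False)
        rank = min(max_rank, len(s))  # truncate to the allowed TT-rank
        cores.append(u[:, :rank].reshape(rank_prev, shape[k], rank))
        # Carry the remaining factor forward and expose the next mode.
        remainder = (s[:rank, None] * vt[:rank]).reshape(rank * shape[k + 1], -1)
        rank_prev = rank
    cores.append(remainder.reshape(rank_prev, shape[-1], 1))
    return cores

def tt_to_full(cores):
    """Contract TT cores back into the full tensor."""
    full = cores[0]
    for core in cores[1:]:
        full = np.tensordot(full, core, axes=([-1], [0]))
    return full.squeeze(axis=(0, -1))  # drop the boundary ranks of size 1

# Build a tensor that has exact TT-rank 2 (assumed toy sizes).
rng = np.random.default_rng(0)
shape, true_rank = (4, 5, 6, 7), 2
ranks = [1, true_rank, true_rank, true_rank, 1]
true_cores = [rng.standard_normal((ranks[i], shape[i], ranks[i + 1]))
              for i in range(len(shape))]
full_tensor = tt_to_full(true_cores)

cores = tt_svd(full_tensor, max_rank=2)
reconstruction = tt_to_full(cores)

dense_params = full_tensor.size                 # entries stored densely
tt_params = sum(core.size for core in cores)    # entries stored in TT format
max_error = np.max(np.abs(reconstruction - full_tensor))
```

With these toy sizes the dense tensor stores 4·5·6·7 = 840 entries, while the rank-2 cores store 8 + 20 + 24 + 14 = 66. Smaller TT-ranks shrink the cores further, which is the complexity side of the trade-off the abstract describes.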

