CL, LG · Oct 30, 2023

Partial Tensorized Transformers for Natural Language Processing

arXiv:2310.20077v1 · 0.91 citations · h-index: 3

Originality: Incremental advance

AI Analysis

This work addresses efficiency issues in NLP and vision-language models, offering a novel compression method that enhances accuracy, though it is incremental in the context of tensor decomposition techniques.

The authors tackled the high memory and parameter requirements of transformer models like BERT and ViT by applying tensor-train decomposition, achieving up to 5% accuracy improvement without post-training adjustments.

The transformer architecture has revolutionized Natural Language Processing (NLP) and other machine-learning tasks due to its unprecedented accuracy. However, its extensive memory and parameter requirements often hinder practical applications. In this work, we study the effect of tensor-train decomposition on improving the accuracy of, and compressing, transformer vision-language neural networks, namely BERT and ViT. We focus on both embedding-layer compression and partial tensorization of neural networks (PTNN) through an algorithmic approach. Our novel PTNN approach significantly improves the accuracy of existing models by up to 5%, all without the need for post-training adjustments, breaking new ground in the field of tensor decomposition.
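For readers unfamiliar with tensor-train decomposition, the following is a minimal sketch of the generic TT-SVD procedure applied to a toy embedding matrix. It is not the authors' PTNN algorithm: the function names `tt_svd` and `tt_reconstruct`, the 512 x 64 embedding size, the reshape into a 32 x 32 x 32 tensor, and the rank cap of 8 are all assumptions chosen for illustration.

```python
import numpy as np

def tt_svd(tensor, max_rank):
    """Split a d-way tensor into tensor-train cores via sequential truncated SVDs."""
    dims = tensor.shape
    cores = []
    rank_prev = 1
    mat = tensor
    for n in dims[:-1]:
        # Unfold so rows index (previous rank, current mode).
        mat = mat.reshape(rank_prev * n, -1)
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        rank = min(max_rank, s.size)
        cores.append(u[:, :rank].reshape(rank_prev, n, rank))
        # Carry the remaining factor forward to the next mode.
        mat = s[:rank, None] * vt[:rank]
        rank_prev = rank
    cores.append(mat.reshape(rank_prev, dims[-1], 1))
    return cores

def tt_reconstruct(cores):
    """Contract the cores back into a full tensor."""
    out = cores[0]
    for core in cores[1:]:
        out = np.tensordot(out, core, axes=([-1], [0]))
    return out.squeeze(axis=(0, -1))

# Hypothetical embedding table: 512 tokens x 64 dimensions (assumed sizes).
rng = np.random.default_rng(0)
embedding = rng.standard_normal((512, 64))

# Assumed factorization: 512 = 8*8*8 and 64 = 4*4*4, with row and column
# factors interleaved so each TT mode has size 8*4 = 32.
tensor = (embedding.reshape(8, 8, 8, 4, 4, 4)
          .transpose(0, 3, 1, 4, 2, 5)
          .reshape(32, 32, 32))

cores_full = tt_svd(tensor, max_rank=32)   # no truncation: exact
cores_low = tt_svd(tensor, max_rank=8)     # truncated: compressed

params_dense = embedding.size
params_low = sum(core.size for core in cores_low)
rel_error = (np.linalg.norm(tt_reconstruct(cores_low) - tensor)
             / np.linalg.norm(tensor))
print(params_dense, params_low, rel_error)
```

A random matrix has no low-rank structure, so the truncated reconstruction error here is large; the sketch only shows the mechanics and the parameter saving. How well a trained embedding layer tolerates truncation is an empirical question that the paper itself addresses.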
