CL · LG · AS · Jun 28, 2022

Bottleneck Low-rank Transformers for Low-resource Spoken Language Understanding

arXiv:2206.14318v1 · 8 citations · h-index: 35
Originality: Incremental advance
AI Analysis

This work addresses the challenge of model size for on-edge spoken language understanding in low-resource settings, presenting an incremental improvement in efficiency.

The paper tackles the problem of deploying large transformer models for spoken language understanding in low-resource, on-edge applications. It proposes a lean transformer structure that reduces the attention dimension via group sparsity and, in a variant, transfers the learned attention subspace to an attention bottleneck layer. Without pre-training, the resulting compact model achieves accuracies competitive with those of large pre-trained models.

End-to-end spoken language understanding (SLU) systems benefit from pre-training on large corpora, followed by fine-tuning on application-specific data. The resulting models are too large for on-edge applications. For instance, BERT-based systems contain over 110M parameters. Observing that the model is overparameterized, we propose a lean transformer structure in which the dimension of the attention mechanism is automatically reduced using group sparsity. We also propose a variant in which the learned attention subspace is transferred to an attention bottleneck layer. In a low-resource setting and without pre-training, the resulting compact SLU model achieves accuracies competitive with those of large pre-trained models.
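As a rough illustration of how group sparsity can shrink an attention dimension, the sketch below applies a group-lasso penalty to the query and key projections of a single head. The abstract does not say which weights are grouped or how the penalty is optimized, so everything here is an assumption: the grouping (column j of both projections forms one group), the proximal soft-thresholding step, and all function names are hypothetical and not taken from the paper.

```python
import numpy as np

def group_norms(w_q, w_k):
    """Joint L2 norm of each attention dimension (column j of w_q and w_k).

    Assumption: one group per attention dimension, spanning both projections.
    Shapes: w_q and w_k are (d_model, d_att); the result is (d_att,).
    """
    return np.sqrt((w_q ** 2).sum(axis=0) + (w_k ** 2).sum(axis=0))

def group_lasso_penalty(w_q, w_k):
    """Sum of group norms; adding this to the loss pushes whole groups to zero."""
    return group_norms(w_q, w_k).sum()

def group_soft_threshold(w_q, w_k, lam):
    """Proximal step for the group-lasso penalty (an assumed optimizer choice).

    Each group's norm is reduced by lam; groups with norm <= lam become zero.
    """
    norms = group_norms(w_q, w_k)
    scale = np.maximum(0.0, 1.0 - lam / np.maximum(norms, 1e-12))
    return w_q * scale, w_k * scale

def prune_zero_dims(w_q, w_k, tol=1e-8):
    """Drop attention dimensions whose group norm has collapsed to zero."""
    keep = group_norms(w_q, w_k) > tol
    return w_q[:, keep], w_k[:, keep]

# Toy example: d_model = 4, d_att = 3, with one nearly unused dimension.
w_q = np.array([[1.0, 0.01, 0.5]] * 4)
w_k = np.array([[1.0, 0.01, 0.5]] * 4)

s_q, s_k = group_soft_threshold(w_q, w_k, lam=0.5)
p_q, p_k = prune_zero_dims(s_q, s_k)
print("attention dimension before:", w_q.shape[1], "after:", p_q.shape[1])
```

Grouping by column means a pruned dimension disappears from the query and key projections together, so the query-key dot product stays well defined at the smaller width. How the paper then transfers the surviving subspace to a bottleneck layer is not described in the abstract and is not sketched here.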

