CL | Oct 26, 2023

Learning to Abstract with Nonparametric Variational Information Bottleneck

Melika Behjati, Fabio Fehr, James Henderson

arXiv:2310.17284v121.2133 citations | h-index: 33 | Has Code

Originality: Highly original

AI Analysis

This work addresses the costly need for separate models for different abstraction levels in NLP, offering a more efficient and robust approach.

The paper addresses the cost of learning tokenization-specific textual embeddings for each level of abstraction. It introduces a language representation model that learns to compress to different abstraction levels within a single model, using Nonparametric Variational Information Bottleneck (NVIB). The results show that the model's layers correspond to increasing levels of abstraction, yield more linguistically informed representations, and improve robustness to adversarial perturbations.

Learned representations at the level of characters, sub-words, words and sentences have each contributed to advances in understanding different NLP tasks and linguistic phenomena. However, learning textual embeddings is costly as they are tokenization specific and require different models to be trained for each level of abstraction. We introduce a novel language representation model which can learn to compress to different levels of abstraction at different layers of the same model. We apply Nonparametric Variational Information Bottleneck (NVIB) to stacked Transformer self-attention layers in the encoder, which encourages an information-theoretic compression of the representations through the model. We find that the layers within the model correspond to increasing levels of abstraction and that their representations are more linguistically informed. Finally, we show that NVIB compression results in a model which is more robust to adversarial perturbations.
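
To make the idea of an information-theoretic compression penalty applied at several layers more concrete, the following is a minimal sketch, not the authors' method. It swaps in a plain Gaussian variational information bottleneck as a simplified stand-in; NVIB itself is not reproduced here. All names (gaussian_kl, sample, bottleneck_penalty), the per-layer weights, and the toy data are hypothetical and chosen only for illustration.

```python
import math
import random


def gaussian_kl(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) for a single vector."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var))


def sample(mu, log_var, rng):
    """Reparameterised sample: mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]


def bottleneck_penalty(layers, weights):
    """Weighted sum over layers of the mean per-token KL term.

    layers:  list of layers; each layer is a list of (mu, log_var) pairs, one per token.
    weights: one hypothetical regularisation weight per layer.
    """
    total = 0.0
    for layer, w in zip(layers, weights):
        mean_kl = sum(gaussian_kl(mu, lv) for mu, lv in layer) / len(layer)
        total += w * mean_kl
    return total


rng = random.Random(0)

# Toy data (assumption): two layers, three tokens each, 4-dimensional representations.
layers = [
    [([rng.gauss(0.0, 1.0) for _ in range(4)], [0.0] * 4) for _ in range(3)]
    for _ in range(2)
]

# A larger weight on the later layer pushes it toward stronger compression.
penalty = bottleneck_penalty(layers, weights=[0.1, 0.5])

# A noisy representation for the first token of the first layer.
noisy = sample(layers[0][0][0], layers[0][0][1], rng)
```

In a training loop, a penalty of this kind would be added to the task loss, so that each layer trades off retained information against compression. How NVIB defines its prior and applies it inside self-attention is specified in the paper and its released code, not in this sketch.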
