CL · Nov 19, 2021

Does BERT look at sentiment lexicon?

Elena Razova, Sergey Vychegzhanin, Evgeny Kotelnikov

arXiv:2111.10100v1 · 0.54 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses the interpretability gap in deep neural networks for sentiment analysis and offers insight into how they differ from rule-based methods. The contribution is incremental, since it covers a single model family (RuBERT) and a single language (Russian).

The study investigated whether BERT models take sentiment lexicon into account by analyzing attention weights in the Russian-language RuBERT model. On average, about 75% of attention heads across the fine-tuned model variants paid statistically more attention to sentiment lexicon than to neutral lexicon.

The main approaches to sentiment analysis are rule-based methods and machine learning, in particular, deep neural network models with the Transformer architecture, including BERT. The performance of neural network models in the tasks of sentiment analysis is superior to the performance of rule-based methods. The reasons for this situation remain unclear due to the poor interpretability of deep neural network models. One of the main keys to understanding the fundamental differences between the two approaches is the analysis of how sentiment lexicon is taken into account in neural network models. To this end, we study the attention weights matrices of the Russian-language RuBERT model. We fine-tune RuBERT on sentiment text corpora and compare the distributions of attention weights for sentiment and neutral lexicons. It turns out that, on average, 3/4 of the heads of various model variants statistically pay more attention to the sentiment lexicon compared to the neutral one.
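
The head-level comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not say how attention is aggregated per token or which statistical test is used, so everything here is an assumption: attention "received" by a token is taken as the mean of its column in a head's attention matrix, the comparison is a one-sided permutation test on the difference of means, and the function names and the toy matrices are hypothetical. It uses synthetic data only and does not load RuBERT.

```python
import random
from statistics import mean


def attention_received(attn, token_idx):
    """Mean attention that token `token_idx` receives from all query positions.

    attn[i][j] is the weight from query position i to key position j
    (assumed aggregation; the abstract does not specify one).
    """
    return mean(row[token_idx] for row in attn)


def permutation_pvalue(sent, neut, n_perm=2000, seed=0):
    """One-sided permutation test for mean(sent) > mean(neut) (assumed test)."""
    rng = random.Random(seed)
    observed = mean(sent) - mean(neut)
    pooled = list(sent) + list(neut)
    k = len(sent)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if mean(pooled[:k]) - mean(pooled[k:]) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


def fraction_of_sentiment_heads(heads, sentiment_idx, neutral_idx, alpha=0.05):
    """Share of heads whose sentiment tokens receive significantly more attention."""
    favoring = 0
    for attn in heads:
        sent = [attention_received(attn, j) for j in sentiment_idx]
        neut = [attention_received(attn, j) for j in neutral_idx]
        if permutation_pvalue(sent, neut) < alpha:
            favoring += 1
    return favoring / len(heads)


# Synthetic 8-token example: positions 0-3 stand in for sentiment words,
# positions 4-7 for neutral words. Each row sums to 1, like a softmax output.
n_tokens = 8
sentiment_idx = [0, 1, 2, 3]
neutral_idx = [4, 5, 6, 7]

# Toy head that concentrates attention on the sentiment positions.
focused_head = [[0.2] * 4 + [0.05] * 4 for _ in range(n_tokens)]
# Toy head that spreads attention uniformly.
uniform_head = [[1 / n_tokens] * n_tokens for _ in range(n_tokens)]

frac = fraction_of_sentiment_heads(
    [focused_head, uniform_head], sentiment_idx, neutral_idx
)
print(f"fraction of heads favoring sentiment tokens: {frac}")
```

In a real analysis the toy matrices would be replaced by per-head attention matrices extracted from the fine-tuned model, and the index lists by positions of tokens matched against a sentiment lexicon. A rank-based test could be substituted for the permutation test without changing the overall structure.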
