SentiLARE: Sentiment-Aware Language Representation Learning with Linguistic Knowledge
This paper targets sentiment analysis tasks for NLP practitioners and offers an incremental improvement by incorporating existing linguistic resources into pre-trained models.
The authors address the lack of linguistic knowledge in pre-trained language models by proposing SentiLARE, which integrates word-level part-of-speech tags and sentiment polarity from SentiWordNet and reports new state-of-the-art performance on a variety of sentiment analysis tasks.
Most existing pre-trained language representation models neglect the linguistic knowledge of texts, which can promote language understanding in NLP tasks. To benefit downstream sentiment analysis tasks, we propose a novel language representation model called SentiLARE, which introduces word-level linguistic knowledge, including part-of-speech tags and sentiment polarity (inferred from SentiWordNet), into pre-trained models. We first propose a context-aware sentiment attention mechanism that acquires the sentiment polarity of each word, given its part-of-speech tag, by querying SentiWordNet. Then, we devise a new pre-training task called label-aware masked language model to construct knowledge-aware language representations. Experiments show that SentiLARE obtains new state-of-the-art performance on a variety of sentiment analysis tasks.
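To make the idea of context-aware sentiment attention more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical miniature sense inventory (`TOY_LEXICON`) in place of SentiWordNet, and a simple bag-of-words cosine (`bow_cosine`) in place of whatever context representation the paper uses. The function name `word_polarity` and its `threshold` parameter are likewise illustrative. The sketch weights each sense of a (word, part-of-speech) pair by how well its gloss matches the surrounding sentence, then averages the senses' positive-minus-negative scores under those weights.

```python
import math
from collections import Counter

# Hypothetical miniature sense inventory standing in for SentiWordNet.
# Each (word, pos) key maps to senses: (gloss, positive_score, negative_score).
# The glosses and scores are made up for illustration.
TOY_LEXICON = {
    ("cold", "a"): [
        ("having a low temperature", 0.0, 0.25),
        ("lacking warmth of feeling unfriendly", 0.0, 0.75),
    ],
    ("fine", "a"): [
        ("being satisfactory or in good condition", 0.75, 0.0),
        ("thin in thickness or diameter", 0.0, 0.0),
    ],
}


def bow_cosine(a, b):
    """Cosine similarity between the bag-of-words counts of two strings."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def word_polarity(word, pos, context, threshold=0.0):
    """Return (label, score) for a word given its POS tag and sentence context."""
    senses = TOY_LEXICON.get((word, pos))
    if not senses:
        # Words missing from the lexicon are treated as neutral (an assumption).
        return "neutral", 0.0

    # Attention logits: similarity between the context and each sense gloss.
    logits = [bow_cosine(context, gloss) for gloss, _, _ in senses]

    # Numerically stable softmax over the senses.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Attention-weighted sum of (positive - negative) sense scores.
    score = sum(w * (p - n) for w, (_, p, n) in zip(weights, senses))

    if score > threshold:
        return "positive", score
    if score < -threshold:
        return "negative", score
    return "neutral", score


if __name__ == "__main__":
    print(word_polarity("cold", "a", "the waiter was unfriendly and lacking warmth"))
    print(word_polarity("fine", "a", "the food was in good condition"))
```

In this toy setting, a context that overlaps with the "unfriendly" gloss of "cold" pulls the weighted score further toward the negative side than a context with no overlap would. The resulting per-word labels are the kind of word-level knowledge that the abstract describes feeding into pre-training.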