CL, LG, SI, SOC-PH | Sep 18, 2021

Augmenting semantic lexicons using word embeddings and transfer learning

arXiv:2109.09010v2 | 9 citations
AI Analysis

This work addresses the problem of efficiently updating sentiment lexicons, which matters to researchers and developers who rely on lexicon-based models. The contribution is incremental, since it builds on existing embedding and transfer learning techniques.

The authors tackled the challenge of expanding sentiment lexicons automatically by proposing two models: a baseline shallow neural network and an improved deep Transformer-based model that uses word definitions. Their evaluation showed both models could score new words with accuracy comparable to human reviewers from Amazon Mechanical Turk, but at a fraction of the cost.

Sentiment-aware intelligent systems are essential to a wide array of applications. These systems are driven by language models which broadly fall into two paradigms: lexicon-based and contextual. Although recent contextual models are increasingly dominant, we still see demand for lexicon-based models because of their interpretability and ease of use. For example, lexicon-based models allow researchers to readily determine which words and phrases contribute most to a change in measured sentiment. A challenge for any lexicon-based approach is that the lexicon needs to be routinely expanded with new words and expressions. Here, we propose two models for automatic lexicon expansion. Our first model establishes a baseline, employing a simple and shallow neural network initialized with pre-trained word embeddings using a non-contextual approach. Our second model improves upon our baseline, featuring a deep Transformer-based network that uses word definitions to estimate each word's lexical polarity. Our evaluation shows that both models are able to score new words with accuracy similar to that of reviewers from Amazon Mechanical Turk, but at a fraction of the cost.
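
To make the lexicon-expansion idea concrete, here is a minimal sketch, not the authors' implementation. It swaps the paper's shallow neural network for a closed-form ridge regression, and it assumes hypothetical 3-dimensional toy "embeddings" and a made-up 1-to-9 score scale; the words, vectors, scores, and the function names `fit_ridge` and `predict` are all illustrative. The point is only the workflow: fit a map from pre-trained word vectors to known lexicon scores, then apply it to words the lexicon lacks.

```python
import numpy as np

# Hypothetical toy data: real systems would load pre-trained embeddings
# with hundreds of dimensions; these 3-d vectors are invented for illustration.
EMBEDDINGS = {
    "love":    np.array([0.90, 0.10, 0.00]),
    "joy":     np.array([0.80, 0.20, 0.10]),
    "table":   np.array([0.00, 0.10, 0.90]),
    "chair":   np.array([0.10, 0.00, 0.80]),
    "hate":    np.array([0.10, 0.90, 0.00]),
    "grief":   np.array([0.20, 0.80, 0.10]),
    "delight": np.array([0.85, 0.15, 0.05]),  # not in the lexicon
    "misery":  np.array([0.15, 0.85, 0.05]),  # not in the lexicon
}

# Hypothetical existing lexicon: word -> sentiment score (assumed 1-9 scale).
LEXICON = {"love": 8.4, "joy": 8.2, "table": 5.0,
           "chair": 5.1, "hate": 2.3, "grief": 2.0}


def fit_ridge(lexicon, embeddings, alpha=0.1):
    """Fit ridge regression from embedding vectors to lexicon scores."""
    words = [w for w in lexicon if w in embeddings]
    X = np.stack([embeddings[w] for w in words])
    X = np.hstack([X, np.ones((len(words), 1))])  # append a bias column
    y = np.array([lexicon[w] for w in words])
    penalty = alpha * np.eye(X.shape[1])
    penalty[-1, -1] = 0.0  # leave the bias term unpenalized
    # Solve the regularized normal equations (X^T X + penalty) w = X^T y.
    return np.linalg.solve(X.T @ X + penalty, X.T @ y)


def predict(word, weights, embeddings):
    """Score a word from its embedding using the fitted weights."""
    x = np.append(embeddings[word], 1.0)
    return float(x @ weights)


weights = fit_ridge(LEXICON, EMBEDDINGS)
scores = {w: predict(w, weights, EMBEDDINGS) for w in ("delight", "misery")}
for word, score in scores.items():
    print(f"{word}: {score:.2f}")
```

A linear model is used here only because it is short and deterministic; the definition-based Transformer model described in the abstract is not sketched.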

Code Implementations: 1 repo
