CL · Jan 6, 2022

Fortunately, Discourse Markers Can Enhance Language Models for Sentiment Analysis

arXiv:2201.02026v2 · 10 citations · Has Code
AI Analysis

This work addresses the challenge of data scarcity in sentiment analysis for NLP practitioners, offering an incremental improvement through a novel data generation method.

The paper tackles the poor performance of pretrained language models in zero- and few-shot sentiment analysis. It uses sentiment-carrying discourse markers to generate weakly-labeled data for model adaptation, which improves results across various benchmark datasets, including in the finance domain.

In recent years, pretrained language models have revolutionized the NLP world, achieving state-of-the-art performance in various downstream tasks. However, in many cases, these models do not perform well when labeled data is scarce and the model is expected to perform in the zero- or few-shot setting. Recently, several works have shown that continual pretraining, or performing a second phase of pretraining (inter-training) that is better aligned with the downstream task, can lead to improved results, especially in the scarce-data setting. Here, we propose to leverage sentiment-carrying discourse markers to generate large-scale weakly-labeled data, which in turn can be used to adapt language models for sentiment analysis. Extensive experimental results show the value of our approach on various benchmark datasets, including the finance domain. Code, models and data are available at https://github.com/ibm/tslm-discourse-markers.
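
The weak-labeling idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). It assumes, for illustration only, that a sentence opening with a sentiment-carrying marker such as "Fortunately," inherits that marker's polarity, and that the marker is stripped before the sentence is used as a training example. The marker lexicon `MARKER_POLARITY` and the helpers `weak_label` and `build_weak_dataset` are hypothetical names introduced here.

```python
import re

# Hypothetical marker lexicon for illustration; the paper's actual
# marker list and selection procedure may differ.
MARKER_POLARITY = {
    "fortunately": "positive",
    "luckily": "positive",
    "happily": "positive",
    "unfortunately": "negative",
    "sadly": "negative",
    "regrettably": "negative",
}

# A sentence qualifies if it opens with a known marker followed by a comma.
_MARKER_RE = re.compile(
    r"^\s*(" + "|".join(MARKER_POLARITY) + r")\s*,\s*(.+)$",
    re.IGNORECASE | re.DOTALL,
)


def weak_label(sentence):
    """Return (sentence_without_marker, label), or None if no marker opens it."""
    match = _MARKER_RE.match(sentence)
    if match is None:
        return None
    marker = match.group(1).lower()
    rest = match.group(2).strip()
    if not rest:
        return None
    # Re-capitalize the first character after removing the marker.
    return rest[0].upper() + rest[1:], MARKER_POLARITY[marker]


def build_weak_dataset(sentences):
    """Keep only sentences that open with a marker, paired with weak labels."""
    return [pair for pair in map(weak_label, sentences) if pair is not None]


if __name__ == "__main__":
    corpus = [
        "Fortunately, the flight landed on time.",
        "The market closed at noon.",
        "Unfortunately, revenue fell short of guidance.",
    ]
    for text, label in build_weak_dataset(corpus):
        print(label, "|", text)
```

Under these assumptions, the resulting (text, label) pairs could serve as the weakly-labeled corpus for the second pretraining phase the abstract describes. Removing the marker matters in this sketch because otherwise a model could learn to predict the label from the marker alone.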

Code Implementations: 1 repo
