CL | Jul 13, 2025

Balanced Training Data Augmentation for Aspect-Based Sentiment Analysis

arXiv:2507.09485v16 citations | h-index: 13
Originality: Incremental advance
AI Analysis

This work addresses data scarcity and label imbalance in aspect-based sentiment analysis, a task central to fine-grained sentiment analysis of social media text. The advance is incremental because it builds on existing LLM-based augmentation methods.

The paper tackles the problem of small and unbalanced training data in aspect-based sentiment analysis by proposing an LLM-based data augmentation method optimized with reinforcement learning, achieving superior performance over strong baselines on benchmark datasets.

Aspect-based sentiment analysis (ABSA) is a crucial fine-grained task in social media scenarios that identifies the sentiment polarity of specific aspect terms in a sentence. Although many existing studies leverage large language models (LLMs) to perform ABSA due to their strong context understanding capabilities, they still struggle to learn the context information in the running text because the text is short and the labeled training data are small and unbalanced, with most data labeled with positive sentiment. Data augmentation (DA) is a feasible strategy for providing richer contextual information, especially when using LLMs to create synthetic training data, but it faces challenges in ensuring the quality of the augmented data. In this paper, we propose an LLM-based ABSA approach with training data augmentation. Specifically, an LLM is prompted to generate augmented training data based on the original training data, so as to construct a new training set with larger size and a balanced label distribution to better train an ABSA model. Meanwhile, in order to improve the quality of the augmented data, we propose a reinforcement learning approach to optimize the data augmentation LLM. Experimental results and further analyses on English benchmark datasets for ABSA demonstrate the effectiveness of our approach, where superior performance is observed over strong baselines and most existing studies.
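The balancing step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how many synthetic examples are generated per label or how prompts are built, so this sketch assumes each label is simply topped up to the size of the largest class. The names `Example`, `augmentation_targets`, `balance`, and the `generate` callable (a stand-in for the prompted LLM) are all hypothetical, and the reinforcement learning step is omitted.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

# Hypothetical record layout: (sentence, aspect term, polarity label).
Example = Tuple[str, str, str]


def augmentation_targets(examples: List[Example]) -> Dict[str, int]:
    """Return how many synthetic examples each label needs to match the largest class."""
    counts = Counter(label for _, _, label in examples)
    if not counts:
        return {}
    largest = max(counts.values())
    return {label: largest - n for label, n in counts.items()}


def balance(
    examples: List[Example],
    generate: Callable[[Example], Example],
) -> List[Example]:
    """Top up every minority label with synthetic examples.

    `generate` is an assumed placeholder for the LLM call: it takes one
    original example as a seed and returns a new example with the same label.
    """
    augmented = list(examples)
    for label, needed in augmentation_targets(examples).items():
        seeds = [ex for ex in examples if ex[2] == label]
        for i in range(needed):
            # Cycle through the available seeds for this label.
            augmented.append(generate(seeds[i % len(seeds)]))
    return augmented
```

In practice `generate` would wrap a prompt to the augmentation LLM; any function that preserves the seed's label works for trying the sketch out.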

