CL | AI | May 20, 2025

Enhanced Multimodal Aspect-Based Sentiment Analysis by LLM-Generated Rationales

arXiv:2505.14499v2 | 3 citations | h-index: 3 | ICONIP
Originality: Incremental advance
AI Analysis

The paper addresses the limited capacity and knowledge of the small language models that existing MABSA methods rely on. The advance is incremental: it builds on prior work by integrating LLM-generated rationales.

The paper tackles the problem of inaccurate aspect and sentiment identification in Multimodal Aspect-Based Sentiment Analysis (MABSA) by proposing a framework that combines small language models (SLMs) with rationales generated by large language models (LLMs), resulting in superior performance on three benchmarks.

Interest in Multimodal Aspect-Based Sentiment Analysis (MABSA) has grown in recent years. Existing methods predominantly rely on pre-trained small language models (SLMs) to collect aspect- and sentiment-related information from both image and text, with the aim of aligning the two modalities. However, SLMs have limited capacity and knowledge, which often leads to inaccurate identification of meaning, aspects, sentiments, and their interconnections in textual and visual data. Large language models (LLMs), on the other hand, have shown exceptional capabilities across various tasks by effectively exploring fine-grained information in multimodal data. Yet some studies indicate that LLMs still fall short of fine-tuned small models on ABSA. Based on these findings, we propose a novel framework, termed LRSA, which combines the decision-making capabilities of SLMs with additional information provided by LLMs for MABSA. Specifically, we inject explanations generated by LLMs as rationales into SLMs and employ a dual cross-attention mechanism to enhance feature interaction and fusion, thereby augmenting the SLMs' ability to identify aspects and sentiments. We evaluated our method with two baseline models; extensive experiments highlight the superiority of our approach on three widely used benchmarks, indicating its generalizability and applicability to most pre-trained models for MABSA.
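The abstract names a dual cross-attention mechanism but does not give its equations, so the following is only a minimal sketch of one plausible reading: SLM token features attend to LLM rationale features, rationale features attend back to the SLM features, and the SLM side is fused with a residual sum. The function names (`cross_attention`, `dual_cross_attention`), the single-head attention without learned projections, the residual-sum fusion, and the toy feature values are all assumptions for illustration, not details taken from the paper.

```python
import math

Matrix = list[list[float]]


def softmax(row: list[float]) -> list[float]:
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries: Matrix, keys_values: Matrix) -> Matrix:
    """Single-head scaled dot-product attention.

    Assumption: no learned Q/K/V projections, so the same vectors
    serve as both keys and values.
    """
    dim = len(queries[0])
    scale = math.sqrt(dim)
    output = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys_values]
        weights = softmax(scores)
        # Weighted average of the value vectors for this query.
        output.append(
            [sum(w * v[j] for w, v in zip(weights, keys_values)) for j in range(dim)]
        )
    return output


def dual_cross_attention(slm_feats: Matrix, rationale_feats: Matrix):
    """Hypothetical 'dual' variant: attend in both directions.

    Returns the SLM features fused with rationale context (residual sum,
    an assumed fusion rule) and the rationale features' view of the SLM.
    """
    slm_ctx = cross_attention(slm_feats, rationale_feats)
    rationale_ctx = cross_attention(rationale_feats, slm_feats)
    fused = [[x + c for x, c in zip(row, ctx)] for row, ctx in zip(slm_feats, slm_ctx)]
    return fused, rationale_ctx


# Toy usage: 3 SLM token vectors and 2 rationale token vectors, dimension 2.
slm = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
rationale = [[0.5, 0.5], [1.0, -1.0]]
fused, rationale_ctx = dual_cross_attention(slm, rationale)
print(len(fused), len(fused[0]), len(rationale_ctx), len(rationale_ctx[0]))
```

In a real model the inputs would be encoder hidden states, and each direction would normally have its own learned projections and multiple heads; the sketch keeps only the two-way attention pattern.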

