Domain-Expanded ASTE: Rethinking Generalization in Aspect Sentiment Triplet Extraction
This work addresses the limited generalization of ASTE benchmarks and models, a concern for researchers and practitioners in sentiment analysis. Its contribution is incremental: it expands the scope of evaluation and improves decoding.
The paper tackles the generalization problem in Aspect Sentiment Triplet Extraction (ASTE) in two ways. It introduces a domain-expanded benchmark for evaluating models in both in-domain and out-of-domain settings, and it proposes CASE, a decoding strategy that improves the trustworthiness and performance of large language models on ASTE.
Aspect Sentiment Triplet Extraction (ASTE) is a challenging task in sentiment analysis, aiming to provide fine-grained insights into human sentiments. However, existing benchmarks are limited to two domains and do not evaluate model performance on unseen domains, raising concerns about the generalization of proposed methods. Furthermore, it remains unclear if large language models (LLMs) can effectively handle complex sentiment tasks like ASTE. In this work, we address the issue of generalization in ASTE from both a benchmarking and modeling perspective. We introduce a domain-expanded benchmark by annotating samples from diverse domains, enabling evaluation of models in both in-domain and out-of-domain settings. Additionally, we propose CASE, a simple and effective decoding strategy that enhances trustworthiness and performance of LLMs in ASTE. Through comprehensive experiments involving multiple tasks, settings, and models, we demonstrate that CASE can serve as a general decoding strategy for complex sentiment tasks. By expanding the scope of evaluation and providing a more reliable decoding strategy, we aim to inspire the research community to reevaluate the generalizability of benchmarks and models for ASTE. Our code, data, and models are available at https://github.com/DAMO-NLP-SG/domain-expanded-aste.
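To make the in-domain versus out-of-domain evaluation concrete, the following is a minimal sketch and not the authors' released code. It assumes the common ASTE convention that a predicted (aspect, opinion, sentiment) triplet counts as correct only when it exactly matches a gold triplet. The domain names, the example triplets, and the helper names `triplet_f1` and `evaluate_by_setting` are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class Triplet(NamedTuple):
    aspect: str     # target term, e.g. "battery life"
    opinion: str    # opinion term, e.g. "great"
    sentiment: str  # "positive", "negative", or "neutral"


def triplet_f1(gold, pred):
    """Micro-averaged exact-match F1 over per-sentence triplet lists."""
    tp = sum(len(set(g) & set(p)) for g, p in zip(gold, pred))
    n_gold = sum(len(set(g)) for g in gold)
    n_pred = sum(len(set(p)) for p in pred)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def evaluate_by_setting(train_domain, gold_by_domain, pred_by_domain):
    """Score each test domain and label it relative to the training domain."""
    scores = {}
    for domain, gold in gold_by_domain.items():
        setting = "in-domain" if domain == train_domain else "out-of-domain"
        scores[(domain, setting)] = triplet_f1(gold, pred_by_domain[domain])
    return scores


# Hypothetical data: one sentence per domain, model trained on "laptop".
gold = {
    "laptop": [[Triplet("battery life", "great", "positive")]],
    "hotel": [[Triplet("room", "clean", "positive"),
               Triplet("staff", "rude", "negative")]],
}
pred = {
    "laptop": [[Triplet("battery life", "great", "positive")]],
    # The sentiment of the second triplet is wrong, so it is not a match.
    "hotel": [[Triplet("room", "clean", "positive"),
               Triplet("staff", "rude", "positive")]],
}

scores = evaluate_by_setting("laptop", gold, pred)
for (domain, setting), f1 in scores.items():
    print(f"{domain} ({setting}): F1 = {f1:.2f}")
```

In this toy case the out-of-domain score drops because a single wrong sentiment label invalidates the whole triplet, which shows how strict exact-match scoring is.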