Challenges for Open-domain Targeted Sentiment Analysis
This work aims to improve targeted sentiment analysis across diverse domains and document lengths, and is relevant to NLP researchers and practitioners. The contribution is incremental: it applies existing methods to new data.
The authors address two limitations of prior work on open-domain targeted sentiment analysis, narrow domain coverage and a restriction to sentence-level text, by creating a new dataset of 6,013 human-labeled documents with nested target annotations. Benchmark results show substantial room for improvement on this task. The remaining challenges include making effective use of open-domain data, handling long documents, modeling complex target structures, and coping with variance across domains.
Because previous studies on open-domain targeted sentiment analysis are limited in domain variety and confined to the sentence level, we propose a novel dataset of 6,013 human-labeled documents that extends coverage to a broader range of topics of interest and to the document level. We also introduce a nested target annotation schema that captures the complete sentiment information in a document, making open-domain targeted sentiment analysis more practical and effective. For the task itself, we apply the pre-trained model BART in a sequence-to-sequence generation method. Benchmark results show that there is large room for improvement in open-domain targeted sentiment analysis. Our experiments further show that challenges remain in the effective use of open-domain data, long documents, the complexity of target structure, and domain variance.
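To make the idea of nested targets in a sequence-to-sequence setting more concrete, the following is a minimal sketch of how a nested target annotation might be flattened into an output string that a generation model such as BART could be trained to produce. The abstract does not specify the actual schema or output format, so the `Target` class, the bracketed notation, and the example annotation are all illustrative assumptions rather than the authors' method.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Target:
    """A sentiment target (hypothetical schema, not the paper's format).

    `children` holds targets nested inside this one, e.g. aspects of a product.
    """
    text: str
    polarity: str  # assumed label set: "positive", "negative", "neutral"
    children: List["Target"] = field(default_factory=list)


def linearize(target: Target) -> str:
    """Serialize one target and its nested targets, depth-first, as a bracketed string."""
    inner = "".join(" " + linearize(child) for child in target.children)
    return f"[ {target.text} | {target.polarity}{inner} ]"


def linearize_document(targets: List[Target]) -> str:
    """Join the top-level targets of a document into one generation target string."""
    return " ".join(linearize(t) for t in targets)


# Illustrative annotation (invented for this sketch, not taken from the dataset).
example = [
    Target("the phone", "positive", [
        Target("battery life", "positive"),
        Target("camera", "negative"),
    ]),
]
output = linearize_document(example)
print(output)
```

Under these assumptions, the document text would be the model input and the bracketed string the expected output, so the nesting depth of the brackets mirrors the nesting of the annotated targets.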