CLSep 15, 2017

Unsupervised Aspect Term Extraction with B-LSTM & CRF using Automatically Labelled Datasets

Athanasios Giannakopoulos, Claudiu Musat, Andreea Hossmann, Michael Baeriswyl

arXiv:1709.05094v139.31095 citations

Originality Incremental advance

AI Analysis

This addresses the costly human annotation and limited datasets for aspect term extraction, offering an efficient unsupervised solution for sentiment analysis tasks.

The paper tackles the problem of unsupervised aspect term extraction by introducing an architecture that achieves top-ranking supervised performance and a method to automatically construct datasets, resulting in dramatically higher F-scores and precision above 80% compared to baselines.

Aspect Term Extraction (ATE) identifies opinionated aspect terms in texts and is one of the tasks in the SemEval Aspect Based Sentiment Analysis (ABSA) contest. The small amount of available datasets for supervised ATE and the costly human annotation for aspect term labelling give rise to the need for unsupervised ATE. In this paper, we introduce an architecture that achieves top-ranking performance for supervised ATE. Moreover, it can be used efficiently as feature extractor and classifier for unsupervised ATE. Our second contribution is a method to automatically construct datasets for ATE. We train a classifier on our automatically labelled datasets and evaluate it on the human annotated SemEval ABSA test sets. Compared to a strong rule-based baseline, we obtain a dramatically higher F-score and attain precision values above 80%. Our unsupervised method beats the supervised ABSA baseline from SemEval, while preserving high precision scores.

View on arXiv PDF

Similar