CL, Feb 6, 2017

Q-WordNet PPV: Simple, Robust and (almost) Unsupervised Generation of Polarity Lexicons for Multiple Languages

Iñaki San Vicente, Rodrigo Agerri, German Rigau

arXiv:1702.01711v1 1.044 citations

Originality: Incremental advance

AI Analysis

This work addresses the need for robust sentiment analysis tools across languages, especially where annotated data is scarce. It is an incremental advance, since it builds on existing dictionary-based and PageRank methods.

The paper addresses the automatic generation of polarity lexicons for multiple languages without domain-specific annotated data. Its qwn-ppv method outperformed other automatically generated lexicons in the extrinsic evaluations and performed competitively against manually annotated ones.

This paper presents a simple, robust and (almost) unsupervised dictionary-based method, qwn-ppv (Q-WordNet as Personalized PageRanking Vector), to automatically generate polarity lexicons. We show that qwn-ppv outperforms other automatically generated lexicons in the four extrinsic evaluations presented here. It also shows very competitive and robust results with respect to manually annotated ones. Results suggest that no single lexicon is best for every task and dataset, and that the intrinsic evaluation of polarity lexicons is not a good indicator of performance on a Sentiment Analysis task. The qwn-ppv method makes it easy to create quality polarity lexicons whenever no domain-based annotated corpora are available for a given language.
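
The abstract names Personalized PageRank over a dictionary as the core idea but does not spell out the procedure. As a rough illustration only, the sketch below propagates polarity from one positive and one negative seed word over a tiny hand-made word graph, then labels each word by whichever seed set gives it the higher score. The toy graph, the seed words, the damping factor, the iteration count, the decision rule, and the function names `personalized_pagerank` and `build_lexicon` are all assumptions made for this example; none of them are taken from the paper.

```python
from collections import defaultdict


def personalized_pagerank(graph, seeds, damping=0.85, iterations=50):
    """Power iteration for PageRank where teleportation goes only to `seeds`.

    `graph` maps each node to a list of its neighbours.
    The damping factor and iteration count are illustrative defaults.
    """
    nodes = list(graph)
    # Teleport distribution: uniform over the seed nodes, zero elsewhere.
    teleport = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(teleport)
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) * teleport[n] for n in nodes}
        for node in nodes:
            neighbours = graph[node]
            if not neighbours:
                # Dangling node: send its mass back to the seeds.
                for n in nodes:
                    new_rank[n] += damping * rank[node] * teleport[n]
                continue
            share = damping * rank[node] / len(neighbours)
            for neighbour in neighbours:
                new_rank[neighbour] += share
        rank = new_rank
    return rank


def build_lexicon(graph, positive_seeds, negative_seeds):
    """Label each word by whichever seed set ranks it higher (assumed rule)."""
    pos = personalized_pagerank(graph, positive_seeds)
    neg = personalized_pagerank(graph, negative_seeds)
    lexicon = {}
    for word in graph:
        if pos[word] > neg[word]:
            lexicon[word] = "positive"
        elif neg[word] > pos[word]:
            lexicon[word] = "negative"
    return lexicon


# Hypothetical toy graph standing in for a dictionary's word relations.
edges = [
    ("good", "nice"), ("good", "great"), ("nice", "pleasant"),
    ("pleasant", "mediocre"), ("mediocre", "poor"),
    ("poor", "bad"), ("bad", "awful"), ("awful", "terrible"),
]
graph = defaultdict(list)
for a, b in edges:
    graph[a].append(b)
    graph[b].append(a)
graph = dict(graph)

lexicon = build_lexicon(graph, positive_seeds={"good"}, negative_seeds={"bad"})
for word in sorted(lexicon):
    print(word, lexicon[word])
```

Running two separate rankings, one per seed set, keeps the sketch simple: a word's polarity is decided by relative proximity to each seed in the graph, so only a handful of seed words are needed rather than an annotated corpus.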
