CL IR LG · Oct 10, 2016

Supervised Term Weighting Metrics for Sentiment Analysis in Short Text

Hussam Hamdan, Patrice Bellot, Frederic Bechet

arXiv:1610.03106v1 · 0.83 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses the problem of improving sentiment analysis accuracy for short texts like reviews and tweets, but it is incremental as it focuses on empirical analysis rather than introducing new methods.

The study analyzed fifteen global supervised term weighting metrics and four local metrics to identify characteristics distinguishing effective from ineffective metrics for sentiment analysis in short text, evaluating them with Support Vector Machines on Twitter, restaurant, and laptop review datasets.

Term weighting metrics assign weights to terms in order to discriminate the important terms from the less crucial ones. Due to this characteristic, these metrics have attracted growing attention in text classification and, more recently, in sentiment analysis. Using the weights given by such metrics could lead to a more accurate document representation, which may improve classification performance. While previous studies have focused on proposing or comparing different weighting metrics for two-class document-level sentiment analysis, this study analyses the results given by each metric in order to find the characteristics of good and bad weighting metrics. We therefore present an empirical study of fifteen global supervised weighting metrics combined with four local weighting metrics adopted from information retrieval. We also analyse the behavior of each metric by observing how it distributes the terms, and we deduce some characteristics which may distinguish good metrics from bad ones. The evaluation was done using a Support Vector Machine on three different datasets: Twitter, restaurant reviews and laptop reviews.
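To make the local/global distinction concrete, here is a minimal Python sketch of how a supervised global weight can be combined with a local weight. It is an illustration under stated assumptions, not the paper's method: the abstract does not name its fifteen global or four local metrics, so the global metric used here (a smoothed log ratio of class-conditional document frequencies) and the local metric (raw term frequency) are stand-ins, and the function names `global_weights` and `weight_document` are hypothetical.

```python
import math
from collections import Counter

def global_weights(docs, labels, positive="pos"):
    """Assumed global metric: smoothed log2 ratio of the fraction of
    positive documents containing a term to the fraction of negative
    documents containing it. Not taken from the paper."""
    n_pos = sum(1 for y in labels if y == positive)
    n_neg = len(labels) - n_pos
    df_pos, df_neg = Counter(), Counter()
    for tokens, y in zip(docs, labels):
        target = df_pos if y == positive else df_neg
        target.update(set(tokens))  # count each term once per document
    vocab = set(df_pos) | set(df_neg)
    return {
        t: math.log2(((df_pos[t] + 1) / (n_pos + 2)) /
                     ((df_neg[t] + 1) / (n_neg + 2)))
        for t in vocab
    }

def weight_document(tokens, gw):
    """Assumed local metric (raw term frequency) times the global weight.
    Terms unseen in training get a weight of 0."""
    tf = Counter(tokens)
    return {t: tf[t] * gw.get(t, 0.0) for t in tf}

# Toy labelled corpus, invented for illustration only.
docs = [["good", "food"], ["bad", "food"],
        ["good", "service"], ["bad", "service"]]
labels = ["pos", "neg", "pos", "neg"]

gw = global_weights(docs, labels)
vec = weight_document(["good", "good", "food"], gw)
```

In this toy corpus, "food" appears equally often in both classes and so receives a global weight of zero, while "good" and "bad" receive weights of equal magnitude and opposite sign. The resulting per-document weight dictionaries are the kind of representation that would then be passed to a classifier such as a Support Vector Machine.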
