CL · Jun 9, 2016

Inducing Domain-Specific Sentiment Lexicons from Unlabeled Corpora

arXiv:1606.02820v2 · 340 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for accurate sentiment lexicons in computational social science, enabling large-scale studies of how sentiment varies across domains. The advance is incremental, since it builds on existing embedding and label propagation techniques.

The paper tackles domain-specific sentiment analysis by automatically inducing sentiment lexicons from unlabeled corpora, reaching state-of-the-art performance competitive with approaches that rely on hand-curated resources. Applied to historical and community data, the method shows that more than 5% of sentiment-bearing English words switched polarity over the last 150 years and that sentiment varies drastically across online communities.

A word's sentiment depends on the domain in which it is used. Computational social science research thus requires sentiment lexicons that are specific to the domains being studied. We combine domain-specific word embeddings with a label propagation framework to induce accurate domain-specific sentiment lexicons using small sets of seed words, achieving state-of-the-art performance competitive with approaches that rely on hand-curated resources. Using our framework we perform two large-scale empirical studies to quantify the extent to which sentiment varies across time and between communities. We induce and release historical sentiment lexicons for 150 years of English and community-specific sentiment lexicons for 250 online communities from the social media forum Reddit. The historical lexicons show that more than 5% of sentiment-bearing (non-neutral) English words completely switched polarity during the last 150 years, and the community-specific lexicons highlight how sentiment varies drastically between different communities.
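To make the idea of propagating seed-word labels through an embedding space concrete, here is a minimal sketch. It is not the authors' released implementation. It assumes a nearest-neighbour graph built from cosine similarity and a random walk with restart as the propagation step. The toy 2-D vectors, the function names (`build_transition`, `random_walk`, `induce_polarity`), and the parameters `k` and `beta` are all hypothetical choices made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def build_transition(vectors, k=3):
    """Connect each word to its k most similar words; row-normalise edge weights."""
    words = list(vectors)
    transition = {}
    for w in words:
        # Negative similarities are clipped to zero so weights stay non-negative.
        sims = [(max(cosine(vectors[w], vectors[o]), 0.0), o) for o in words if o != w]
        top = sorted(sims, reverse=True)[:k]
        total = sum(s for s, _ in top)
        transition[w] = {o: s / total for s, o in top} if total else {}
    return transition

def random_walk(transition, seeds, beta=0.85, iters=50):
    """Random walk with restart: with probability 1 - beta, jump back to a seed."""
    words = list(transition)
    restart = {w: (1.0 / len(seeds) if w in seeds else 0.0) for w in words}
    p = dict(restart)
    for _ in range(iters):
        nxt = {w: (1 - beta) * restart[w] for w in words}
        for w, edges in transition.items():
            for o, weight in edges.items():
                nxt[o] += beta * p[w] * weight
        p = nxt
    return p

def induce_polarity(vectors, pos_seeds, neg_seeds, k=3, beta=0.85):
    """Score each word in [0, 1]; values above 0.5 lean positive."""
    transition = build_transition(vectors, k)
    p_pos = random_walk(transition, pos_seeds, beta)
    p_neg = random_walk(transition, neg_seeds, beta)
    scores = {}
    for w in vectors:
        total = p_pos[w] + p_neg[w]
        scores[w] = p_pos[w] / total if total else 0.5
    return scores

# Toy "domain-specific" embeddings (made up for this sketch).
TOY_VECTORS = {
    "good": (1.0, 0.1), "great": (0.9, 0.2), "superb": (0.95, 0.05),
    "bad": (0.1, 1.0), "awful": (0.2, 0.9), "dreadful": (0.05, 0.95),
}

scores = induce_polarity(TOY_VECTORS, {"good", "great"}, {"bad", "awful"})
for word, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{word:10s} {score:.3f}")
```

In this sketch the non-seed words `superb` and `dreadful` receive scores on opposite sides of 0.5 only because of where they sit relative to the seeds in the embedding space. Training the embeddings on a different corpus would change the graph, and therefore the induced polarities, which is the sense in which such a lexicon is domain-specific.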

Code Implementations: 1 repo
