SE · CL · Sep 9, 2017

Sentiment Polarity Detection for Software Development

arXiv:1709.02984v2 · 249 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses sentiment analysis of software developers' communications. It offers a domain-specific tool that improves accuracy in technical contexts, though the contribution is incremental: it adapts existing methods to a new domain.

The authors address the problem that off-the-shelf sentiment analysis tools misclassify technical jargon in software developers' communications. They developed Senti4SD, a classifier trained on a manually annotated gold standard of Stack Overflow posts. Compared with a baseline tool, Senti4SD reduces the misclassification of neutral and positive posts as negative.

The role of sentiment analysis is increasingly emerging to study software developers' emotions by mining crowd-generated content within social software engineering tools. However, off-the-shelf sentiment analysis tools have been trained on non-technical domains and general-purpose social media, thus resulting in misclassifications of technical jargon and problem reports. Here, we present Senti4SD, a classifier specifically trained to support sentiment analysis in developers' communication channels. Senti4SD is trained and validated using a gold standard of Stack Overflow questions, answers, and comments manually annotated for sentiment polarity. It exploits a suite of both lexicon- and keyword-based features, as well as semantic features based on word embedding. With respect to a mainstream off-the-shelf tool, which we use as a baseline, Senti4SD reduces the misclassifications of neutral and positive posts as emotionally negative. To encourage replications, we release a lab package including the classifier, the word embedding space, and the gold standard with annotation guidelines.
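
To make the feature design in the abstract concrete, the following is a minimal sketch of how lexicon-based, keyword-based, and word-embedding features could be concatenated into one vector per post. It is not the Senti4SD implementation: the tiny `LEXICON`, `KEYWORDS`, and two-dimensional `EMBEDDINGS` tables, and all function names, are hypothetical placeholders chosen for illustration. The resulting vectors would then be passed to a supervised classifier, which is omitted here.

```python
from collections import Counter

# Hypothetical toy resources, invented for this sketch only.
LEXICON = {"great": 1, "thanks": 1, "awful": -1, "hate": -1}
KEYWORDS = ["thanks", "error"]
EMBEDDINGS = {
    "great": [0.9, 0.1],
    "thanks": [0.8, 0.2],
    "awful": [-0.9, 0.1],
    "error": [0.0, 1.0],
    "kill": [0.0, 0.9],
}
EMBEDDING_DIM = 2


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (t.strip(".,!?;:").lower() for t in text.split())
    return [t for t in tokens if t]


def lexicon_features(tokens):
    """Count tokens the lexicon marks as positive and as negative."""
    scores = [LEXICON.get(t, 0) for t in tokens]
    positives = sum(1 for s in scores if s > 0)
    negatives = sum(1 for s in scores if s < 0)
    return [positives, negatives]


def keyword_features(tokens):
    """Count occurrences of each tracked keyword."""
    counts = Counter(tokens)
    return [counts[k] for k in KEYWORDS]


def embedding_features(tokens):
    """Average the embedding vectors of all in-vocabulary tokens."""
    vectors = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not vectors:
        return [0.0] * EMBEDDING_DIM
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(EMBEDDING_DIM)]


def featurize(text):
    """Concatenate the three feature groups into one flat vector."""
    tokens = tokenize(text)
    return lexicon_features(tokens) + keyword_features(tokens) + embedding_features(tokens)


if __name__ == "__main__":
    print(featurize("Thanks, this works great!"))
    print(featurize("How do I kill the process after an error?"))
```

In this toy setup, a technical word such as "kill" contributes nothing to the negative lexicon count, while its embedding places it near "error" rather than near emotionally negative words. That is the kind of signal a domain-trained model can use to avoid labelling a neutral technical question as negative.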

Code Implementations: 1 repo