SE, Mar 28, 2019

An Empirical Study of Obsolete Answers on Stack Overflow

arXiv:1903.12282v1, 89 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses the problem of outdated information misleading software developers on Stack Overflow, but its contribution is incremental: it identifies the characteristics of obsolete answers without proposing new solutions.

The study investigates how answers on Stack Overflow become obsolete. It finds that 58.4% of obsolete answers were probably already obsolete when first posted, that only 20.5% of answers observed to be obsolete are ever updated, and that answers under certain tags, such as node.js, are more prone to obsolescence.

Stack Overflow accumulates an enormous amount of software engineering knowledge. However, as time passes, certain knowledge in answers may become obsolete. Such obsolete answers, if not identified or documented clearly, may mislead answer seekers and cause unexpected problems (e.g., using an outdated security protocol). In this paper, we investigate how the knowledge in answers becomes obsolete and identify the characteristics of such obsolete answers. We find that: 1) More than half of the obsolete answers (58.4%) were probably already obsolete when they were first posted. 2) When an answer is observed to be obsolete, only a small proportion (20.5%) of such answers are ever updated. 3) Answers to questions in certain tags (e.g., node.js, ajax, android, and objective-c) are more likely to become obsolete. Our findings suggest that Stack Overflow should develop mechanisms to encourage the whole community to maintain answers (to avoid obsolete answers), and that answer seekers should carefully go through all information (e.g., comments) in answer threads.
