SE CL IR

Mar 23, 2012

Semi-Automatically Extracting FAQs to Improve Accessibility of Software Development Knowledge

Stefan Henß, Martin Monperrus, Mira Mezini

arXiv:1203.5188v1

52 citations

Originality: Synthesis-oriented

AI Analysis

This paper addresses the need for more accessible software development knowledge by reducing the cost of FAQ creation. The contribution is incremental, as it builds on existing text mining and NLP methods.

The paper tackles the high cost of creating FAQs by presenting an approach that automatically extracts them from software development discussions, such as mailing lists and forums, using text mining and NLP techniques. A survey among software developers indicates that the extracted FAQs are of high quality and can be further refined by experts.

Frequently asked questions (FAQs) are a popular way to document software development knowledge. As creating such documents is expensive, this paper presents an approach for automatically extracting FAQs from sources of software development discussion, such as mailing lists and Internet forums, by combining techniques of text mining and natural language processing. We apply the approach to popular mailing lists and carry out a survey among software developers to show that it is able to extract high-quality FAQs that may be further improved by experts.
