Bayesian Statistics in Software Engineering: Practical Guide and Case Studies
This paper gives software engineering researchers and practitioners accessible Bayesian methods for improving data analysis, though its contribution is incremental, since it applies existing statistical techniques to this domain.
The paper tackles the dominance of frequentist statistics in empirical software engineering by providing a practical guide to Bayesian statistics and applying it to case studies on agile vs. structured development, programming language performance, and random testing of object-oriented programs, yielding insights beyond original frequentist results.
Statistics comes in two main flavors: frequentist and Bayesian. For historical and technical reasons, frequentist statistics has dominated data analysis in the past; but Bayesian statistics is making a comeback at the forefront of science. In this paper, we give a practical overview of Bayesian statistics and illustrate its main advantages over frequentist statistics for the kinds of analyses that are common in empirical software engineering, where frequentist statistics is still standard. We also apply Bayesian statistics to empirical data from previous research investigating agile vs. structured development processes, the performance of programming languages, and random testing of object-oriented programs. In addition to being case studies demonstrating how Bayesian analysis can be applied in practice, they provide insights beyond the results in the original publications (which used frequentist statistics), thus showing the practical value brought by Bayesian statistics.
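To make the contrast with frequentist analysis concrete, the following is a minimal sketch of a Bayesian comparison of two success rates using a conjugate Beta-Binomial model. It is not taken from the paper's case studies: the counts, the uniform Beta(1, 1) prior, and the function names (`beta_binomial_update`, `posterior_mean`, `prob_greater`) are all illustrative assumptions. Instead of a p-value, the analysis yields a posterior distribution and a direct estimate of the probability that one rate exceeds the other.

```python
import random


def beta_binomial_update(prior_a, prior_b, successes, trials):
    """Return the posterior Beta(a, b) parameters after observing binomial data."""
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    return prior_a + successes, prior_b + (trials - successes)


def posterior_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


def prob_greater(a1, b1, a2, b2, draws=20000, seed=0):
    """Monte Carlo estimate of P(theta1 > theta2) for independent Beta posteriors."""
    rng = random.Random(seed)
    wins = sum(
        rng.betavariate(a1, b1) > rng.betavariate(a2, b2) for _ in range(draws)
    )
    return wins / draws


# Hypothetical data (not from the paper): projects that met their deadline.
# Process A: 18 of 20 succeeded; process B: 12 of 20 succeeded.
# Assumed prior: uniform Beta(1, 1) on each success rate.
post_a = beta_binomial_update(1, 1, 18, 20)
post_b = beta_binomial_update(1, 1, 12, 20)

p_a_better = prob_greater(*post_a, *post_b)

print("Posterior for A: Beta%s, mean %.3f" % (post_a, posterior_mean(*post_a)))
print("Posterior for B: Beta%s, mean %.3f" % (post_b, posterior_mean(*post_b)))
print("Estimated P(rate_A > rate_B): %.3f" % p_a_better)
```

The uniform prior is only one possible choice; a more informative prior would shift the posteriors, which is the kind of modeling decision a Bayesian analysis makes explicit.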