Rasmus Ros

SEFeb 10, 2021

Controlled Experimentation in Continuous Experimentation: Knowledge and Challenges

Florian Auer, Rasmus Ros, Lukas Kaltenbrunner et al.

Context: Continuous experimentation and A/B testing is an established industry practice that has been researched for more than 10 years. Our aim is to synthesize the conducted research. Objective: We wanted to find the core constituents of a framework for continuous experimentation and the solutions that are applied within the field. Finally, we were interested in the challenges and benefits reported of continuous experimentation. Method: We applied forward snowballing on a known set of papers and identified a total of 128 relevant papers. Based on this set of papers we performed two qualitative narrative syntheses and a thematic synthesis to answer the research questions. Results: The framework constituents for continuous experimentation include experimentation processes as well as supportive technical and organizational infrastructure. The solutions found in the literature were synthesized to nine themes, e.g. experiment design, automated experiments, or metric specification. Concerning the challenges of continuous experimentation, the analysis identified cultural, organizational, business, technical, statistical, ethical, and domain-specific challenges. Further, the study concludes that the benefits of experimentation are mostly implicit in the studies. Conclusions: The research on continuous experimentation has yielded a large body of knowledge on experimentation. The synthesis of published research presented within include recommended infrastructure and experimentation process models, guidelines to mitigate the identified challenges, and what problems the various published solutions solve.

CLApr 26, 2017

On Using Active Learning and Self-Training when Mining Performance Discussions on Stack Overflow

Markus Borg, Iben Lennerstad, Rasmus Ros et al.

Abundant data is the key to successful machine learning. However, supervised learning requires annotated data that are often hard to obtain. In a classification task with limited resources, Active Learning (AL) promises to guide annotators to examples that bring the most value for a classifier. AL can be successfully combined with self-training, i.e., extending a training set with the unlabelled examples for which a classifier is the most certain. We report our experiences on using AL in a systematic manner to train an SVM classifier for Stack Overflow posts discussing performance of software components. We show that the training examples deemed as the most valuable to the classifier are also the most difficult for humans to annotate. Despite carefully evolved annotation criteria, we report low inter-rater agreement, but we also propose mitigation strategies. Finally, based on one annotator's work, we show that self-training can improve the classification accuracy. We conclude the paper by discussing implication for future text miners aspiring to use AL and self-training.

Rasmus Ros

2 Papers