SE | Apr 6, 2018 | Code
Towards Identifying Paid Open Source Developers - A Case Study with Mozilla Developers
Maëlick Claes, Mika Mäntylä, Miikka Kuutila et al.
Open source development contains contributions from both hired and volunteer software developers. Identifying this status is important when we consider the transferability of research results to the closed source software industry, which includes no volunteer developers. While many studies have taken the employment status of developers into account, this information is often gathered manually due to the lack of accurate automatic methods. In this paper, we present an initial step towards predicting paid and unpaid open source development using machine learning and compare our results with automatic techniques used in prior work. By relying on source code repository metadata from Mozilla and manually collected employment status, we built a dataset of the most active developers, both volunteer and hired by Mozilla. We define a set of metrics based on developers' usual commit time pattern and use different classification methods (logistic regression, classification tree, and random forest). The results show that our proposed method identifies paid and unpaid commits with an AUC of 0.75 using random forest, which is higher than the AUC of 0.64 obtained with the best of the previously used automatic methods.
SE | Feb 14, 2018 | Code
Do Programmers Work at Night or During the Weekend?
Maëlick Claes, Mika Mäntylä, Miikka Kuutila et al.
Abnormal working hours can reduce work health, general well-being, and productivity, independent of profession. To inform future approaches for automatic stress and overload detection, this paper establishes empirically collected measures of the work patterns of software engineers. To this end, we perform the first large-scale study of software engineers' working hours by investigating the time stamps of commit activities of 86 large open source software projects, containing both hired and volunteer developers. We find that two-thirds of software engineers mainly follow typical office hours, empirically established to be from 10h to 18h, and do not usually work during nights and weekends. Large variations between projects and individuals exist. Surprisingly, we found no support for the idea that project maturation decreases abnormal working hours. In the Firefox case study, we found that hired developers work more during office hours, while seniority, either in terms of number of commits or job status, did not impact working hours. We conclude that the use of working hours or timestamps of work products for stress detection requires establishing baselines at the level of individuals.
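As a rough illustration of the kind of measure described in this abstract, the sketch below labels commit timestamps as office hours, off-hours, or weekend, using the 10h to 18h window reported above. The function names, the three labels, and the assumption that timestamps are already in the developer's local time are hypothetical choices for this example, not the authors' implementation.

```python
from datetime import datetime

# Office-hours window taken from the abstract (10h to 18h).
OFFICE_START, OFFICE_END = 10, 18


def classify_commit(ts: datetime) -> str:
    """Label a local-time commit timestamp (hypothetical helper)."""
    if ts.weekday() >= 5:  # Monday is 0, so 5 and 6 are Saturday and Sunday
        return "weekend"
    if OFFICE_START <= ts.hour < OFFICE_END:
        return "office"
    return "off-hours"


def office_share(timestamps) -> float:
    """Fraction of commits made on weekdays within the office-hours window."""
    labels = [classify_commit(t) for t in timestamps]
    return labels.count("office") / len(labels) if labels else 0.0
```

Computing `office_share` per developer rather than per project matches the abstract's conclusion that baselines need to be established at the level of individuals.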
SE | Apr 12, 2017 | Code
Abnormal Working Hours: Effect of Rapid Releases and Implications to Work Content
Maëlick Claes, Mika Mäntylä, Miikka Kuutila et al.
During the past years, overload at work leading to psychological diseases, such as burnout, has drawn more public attention. This paper is a preliminary step toward an analysis of the work patterns and possible indicators of overload and time pressure on software developers using a mining software repositories approach. We explore the working pattern of developers in the context of Mozilla Firefox, a large and long-lived open source project. To that end, we investigate the impact of the move from a traditional to a rapid release cycle on work patterns. Moreover, we compare the Mozilla Firefox work pattern with that of another Mozilla product, Firefox OS, which has a different release cycle than Firefox. We find that both projects exhibit healthy working patterns, i.e. lower activity during the weekends and outside of office hours. Firefox experiences proportionally more activity on weekends than Firefox OS (Cohen's d = 0.94). We find that switching to rapid releases has reduced weekend work (Cohen's d = 1.43) and working during the night (Cohen's d = 0.45). This result holds even when we limit the analyses to the hired resources, i.e. considering only individuals with a Mozilla Foundation email address, although the effect sizes are smaller for weekends (Cohen's d = 0.64) and nights (Cohen's d = 0.23). Moreover, we use dissimilarity word clouds and find that work during the weekend is more technical while work during the week expresses more positive sentiment with words like "good" and "nice". Our results suggest that moving to rapid releases has a positive impact on the work health and work-life balance of software engineers. However, caution is needed as our results are based on a limited set of quantitative data from a single organization.
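For readers unfamiliar with the effect sizes quoted in this abstract, the following is a minimal sketch of Cohen's d computed with a pooled standard deviation. The abstract does not say which variant of d the authors used, so the pooled form and the function name are assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(a, b) -> float:
    """Cohen's d with pooled standard deviation (assumed variant)."""
    na, nb = len(a), len(b)
    # statistics.variance is the sample variance (n - 1 denominator).
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)
```

Here `a` and `b` could be, for example, weekly shares of weekend commits before and after the switch to rapid releases; a positive value means the first group has the higher mean.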
SE | Apr 21, 2020
Chat activity is a better predictor than chat sentiment on software developers productivity
Miikka Kuutila, Mika Mäntylä, Maëlick Claes
Recent works have proposed that software developers' positive emotion has a positive impact on software developers' productivity. In this paper we investigate two data sources: developers' chat messages (from Slack and Hipchat) and source code commits of a single co-located Agile team over 200 working days. Our regression analysis shows that the number of chat messages is the best predictor and predicts productivity measured both in the number of commits and lines of code with $R^2$ of 0.33 and 0.27 respectively. We then add sentiment analysis variables until the AIC of our model no longer improves and obtain $R^2$ values of 0.37 (commits) and 0.30 (lines of code). Thus, analyzing chat sentiment improves productivity prediction over chat activity alone, but the difference is not massive. This work supports the idea that emotional state and productivity are linked in software development. We find that three positive sentiment metrics, but surprisingly also one negative sentiment metric, are associated with higher productivity.
SE | Mar 31, 2020
20-MAD -- 20 Years of Issues and Commits of Mozilla and Apache Development
Maëlick Claes, Mika Mäntylä
Data from long-lived and high-profile projects is valuable for research on successful software engineering in the wild. Having a dataset with different linked software repositories of such projects enables deeper investigations. This paper presents 20-MAD, a dataset linking the commit and issue data of Mozilla and Apache projects. It includes over 20 years of information about 765 projects, 3.4M commits, 2.3M issues, and 17.3M issue comments, and its compressed size is over 6 GB. The data contains all the typical information about source code commits (e.g., lines added and removed, message and commit time) and issues (status, severity, votes, and summary). The issue comments have been pre-processed for natural language processing and sentiment analysis. This includes emoticons and valence and arousal scores. Linking code repository and issue tracker information allows studying individuals in two types of repositories and provides more accurate time zone information for issue trackers as well. To our knowledge, this is the largest linked dataset in size and in project lifetime that is not based on GitHub.
SE | Jan 17, 2019
Time Pressure in Software Engineering: A Systematic Review
Miikka Kuutila, Mika Mäntylä, Umar Farooq et al.
Large project overruns and overtime work have been reported in the software industry, resulting in additional expense for companies and personal issues for developers. The present work aims to provide an overview of studies related to time pressure in software engineering; specifically, existing definitions, possible causes, and metrics relevant to time pressure were collected, and a mapping of the studies to software processes and approaches was performed. Moreover, we synthesize results of existing quantitative studies on the effects of time pressure on software development, and offer practical takeaways for practitioners and researchers, based on empirical evidence. Our search strategy examined 5,414 sources, found through repository searches and snowballing. Applying inclusion and exclusion criteria resulted in the selection of 102 papers, which made relevant contributions related to time pressure in software engineering. The majority of high-quality studies report increased productivity and decreased quality under time pressure. Frequent categories of studies focus on quality assurance, cost estimation, and process simulation. It appears that time pressure is usually caused by errors in cost estimation. The effect of time pressure is most often identified during software quality assurance. The majority of empirical studies report increased productivity under time pressure, while most cost estimation and process simulation models assume that compressing the schedule increases the total needed hours. We also find evidence of the mediating effect of knowledge on the effects of time pressure, and that tight deadlines impact tasks with an algorithmic nature more severely. Future research should better contextualize quantitative studies to account for the existing conflicting results and to provide an understanding of situations when time pressure is either beneficial or harmful.
SE | Aug 31, 2018
On the Use of Emoticons in Open Source Software Development
Maëlick Claes, Mika Mäntylä, Umar Farooq
Background: Using sentiment analysis to study software developers' behavior comes with challenges such as the presence of a large amount of technical discussion unlikely to express any positive or negative sentiment. However, emoticons provide information about developer sentiments that can easily be extracted from software repositories. Aim: We investigate how software developers use emoticons differently in issue trackers in order to better understand the differences between developers and determine to what extent emoticons can be used in place of sentiment analysis. Method: We extract emoticons from 1.3M comments from Apache's issue tracker and 4.5M from Mozilla's issue tracker using regular expressions built from a list of emoticons used by SentiStrength and Wikipedia. We check for statistical differences using Mann-Whitney U tests and determine the effect size with Cliff's delta. Results: Overall, Mozilla developers rely more on emoticons than Apache developers. While the overall ratio of comments with emoticons is 2% and 3.6% for Apache and Mozilla respectively, some individual developers can have a ratio above 20%. Looking specifically at Mozilla developers, we find that western developers use significantly more emoticons (with large effect size) than eastern developers. While the majority of emoticons are used to express joy, we find that Mozilla developers use emoticons more frequently to express sadness and surprise than Apache developers. Finally, we find that developers use overall more emoticons during weekends than during weekdays, with the share of sad and surprised emoticons increasing during weekends. Conclusions: While emoticons are primarily used to express joy, the more occasional use of sad and surprised emoticons can potentially be utilized to detect frustration in place of sentiment analysis among developers using emoticons frequently enough.
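The method in this abstract (regular expressions built from an emoticon list, then Cliff's delta as effect size) can be sketched roughly as follows. The short `EMOTICONS` list is a made-up stand-in for the SentiStrength and Wikipedia lists, and the function names are hypothetical; this is an illustration under those assumptions, not the authors' pipeline.

```python
import re

# Illustrative stand-in for the much longer emoticon lists used in the study.
EMOTICONS = [":-)", ":)", ":-(", ":(", ";)", ":D"]

# Longest alternatives first so ":-)" is tried before ":)".
EMOTICON_RE = re.compile(
    "|".join(re.escape(e) for e in sorted(EMOTICONS, key=len, reverse=True))
)


def emoticon_ratio(comments) -> float:
    """Share of comments containing at least one listed emoticon."""
    if not comments:
        return 0.0
    return sum(1 for c in comments if EMOTICON_RE.search(c)) / len(comments)


def cliffs_delta(xs, ys) -> float:
    """Cliff's delta: (#pairs with x > y minus #pairs with x < y) / (|xs| * |ys|)."""
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))
```

In this setting, `xs` and `ys` would be per-developer emoticon ratios for two groups; the quadratic pair loop is fine for a sketch but would need a sort-based variant for large samples.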
CL | Aug 24, 2018
Measuring LDA Topic Stability from Clusters of Replicated Runs
Mika Mäntylä, Maëlick Claes, Umar Farooq
Background: Unstructured and textual data is increasing rapidly and Latent Dirichlet Allocation (LDA) topic modeling is a popular data analysis method for it. Past work suggests that instability of LDA topics may lead to systematic errors. Aim: We propose a method that relies on replicated LDA runs, clustering, and providing a stability metric for the topics. Method: We generate k LDA topics and replicate this process n times, resulting in n*k topics. Then we use K-medoids to cluster the n*k topics into k clusters. The k clusters now represent the original LDA topics and we present them like normal LDA topics showing the ten most probable words. For the clusters, we try multiple stability metrics, out of which we recommend Rank-Biased Overlap, showing the stability of the topics inside the clusters. Results: We provide an initial validation where our method is used for 270,000 Mozilla Firefox commit messages with k=20 and n=20. We show how our topic stability metrics are related to the contents of the topics. Conclusions: Advances in text mining enable us to analyze large masses of text in software engineering but non-deterministic algorithms, such as LDA, may lead to unreplicable conclusions. Our approach makes LDA stability transparent and is also complementary rather than alternative to many prior works that focus on LDA parameter tuning.
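To make the recommended metric more concrete, here is a minimal sketch of Rank-Biased Overlap between two ranked word lists, such as the top words of two topics in the same cluster. It computes only the truncated sum up to the shorter list's length (no extrapolation), and the persistence parameter `p = 0.9` and the function name are assumptions; the abstract does not state which RBO variant or parameters the authors used.

```python
def rbo_truncated(ranking_a, ranking_b, p: float = 0.9) -> float:
    """Truncated Rank-Biased Overlap for two duplicate-free ranked lists.

    Returns (1 - p) * sum_{d=1..k} p^(d-1) * |A[:d] & B[:d]| / d,
    where k is the length of the shorter list.
    """
    depth = min(len(ranking_a), len(ranking_b))
    seen_a, seen_b = set(), set()
    score = 0.0
    for d in range(1, depth + 1):
        seen_a.add(ranking_a[d - 1])
        seen_b.add(ranking_b[d - 1])
        agreement = len(seen_a & seen_b) / d  # overlap of the two top-d prefixes
        score += p ** (d - 1) * agreement
    return (1 - p) * score
```

Because the sum is truncated, two identical lists of length k score 1 - p^k rather than 1, so values are only comparable between lists of the same length; averaging pairwise scores inside a cluster is one plausible way to turn this into a per-cluster stability figure.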
SE | Aug 16, 2018
Using Experience Sampling to link Software Repositories with Emotions and Work Well-Being
Miikka Kuutila, Mika Mäntylä, Maëlick Claes et al.
Background: The experience sampling method studies everyday experiences of humans in natural environments. In psychology it has been used to study the relationships between work well-being and productivity. To the best of our knowledge, daily experience sampling has not been previously used in software engineering. Aims: Our aim is to identify links between software developers' self-reported affective states and work well-being and measures obtained from software repositories. Method: We perform an experience sampling study in a software company for a period of eight months and use logistic regression to link the well-being measures with development activities, i.e. number of commits and chat messages. Results: We find several significant relationships between questionnaire variables and software repository variables. To our surprise, the relationship between hurry and number of commits is negative, meaning more perceived hurry is linked with a smaller number of commits. We also find a negative relationship between social interaction and hindered work well-being. Conclusions: The negative link between commits and hurry is counter-intuitive and goes against previous lab experiments in software engineering that show increased efficiency under time pressure. Overall, our work is an initial step in using experience sampling in software engineering and validating theories on work well-being from other fields in the domain of software engineering.
SE | Mar 27, 2017
Bootstrapping a Lexicon for Emotional Arousal in Software Engineering
Mika V. Mäntylä, Nicole Novielli, Filippo Lanubile et al.
Emotional arousal increases activation and performance but may also lead to burnout in software development. We present the first version of a Software Engineering Arousal lexicon (SEA) that is specifically designed to address the problem of emotional arousal in the software developer ecosystem. SEA is built using a bootstrapping approach that combines a word embedding model trained on issue-tracking data and manual scoring of items in the lexicon. We show that our lexicon is able to differentiate between issue priorities, which are a source of emotional activation and then act as a proxy for arousal. The best performance is obtained by combining SEA (428 words) with a previously created general-purpose lexicon by Warriner et al. (13,915 words), and it achieves Cohen's d effect sizes up to 0.5.
SE | Mar 13, 2017
Reviewing Literature on Time Pressure in Software Engineering and Related Professions - Computer Assisted Interdisciplinary Literature Review
Miikka Kuutila, Mika V. Mäntylä, Maëlick Claes et al.
During the past years, psychological diseases related to unhealthy work environments, such as burnout, have drawn more and more public attention. One of the known causes of these affective problems is time pressure. In order to form a theoretical background for time pressure detection in software repositories, this paper combines interdisciplinary knowledge by analyzing 1270 papers found in the Scopus database and containing terms related to time pressure. By clustering those papers based on their abstracts, we show that time pressure has been widely studied across different fields, but relatively little in software engineering. From a literature review of the most relevant papers, we infer a list of testable hypotheses that we want to verify in future studies in order to assess the impact of time pressure on software developers' mental health.