Eleni Constantinou

SE
7 papers
285 citations
Novelty 16%
AI Score 20

7 Papers

SE · Mar 15, 2021 · Code
Does the duration of rapid release cycles affect the bug handling activity?

Thorn Jansen, Zeinab Abou Khalil, Eleni Constantinou et al.

Software projects are regularly updated with new functionality and bug fixes through so-called releases. In recent years, many software projects have been shifting to shorter release cycles and this can affect the bug handling activity. Past research has focused on the impact of switching from traditional to rapid release cycles with respect to bug handling activity, but the effect of the rapid release cycle duration has not yet been studied. We empirically investigate releases of 420 open source projects with rapid release cycles to understand the effect of variable and rapid release cycle durations on bug handling activity. We group the releases of these projects into five categories of release cycle durations. For each project, we investigate how the sequence of releases is related to bug handling activity metrics and we study the effect of the variability of cycle durations on bug fixing. Our results did not reveal any statistically significant difference for the studied bug handling activity metrics in the presence of variable rapid release cycle durations. This suggests that the duration of fast release cycles does not seem to impact bug handling activity.

SE · Jun 19, 2019 · Code
On the abandonment and survival of open source projects: An empirical investigation

Guilherme Avelino, Eleni Constantinou, Marco Tulio Valente et al.

Background: Evolution of open source projects frequently depends on a small number of core developers. The loss of such core developers might be detrimental for projects and even threaten their entire continuation. However, it is possible that new core developers assume the project maintenance and allow the project to survive. Aims: The objective of this paper is to provide empirical evidence on: 1) the frequency of project abandonment and survival, 2) the differences between abandoned and surviving projects, and 3) the motivation and difficulties faced when assuming an abandoned project. Method: We adopt a mixed-methods approach to investigate project abandonment and survival. We carefully select 1,932 popular GitHub projects and recover the abandoned and surviving projects, and conduct a survey with developers who have been instrumental in the survival of the projects. Results: We found that 315 projects (16%) were abandoned and 128 of these projects (41%) survived because of new core developers who assumed the project development. The survey indicates that (i) in most cases the new maintainers were aware of the project abandonment risks when they started to contribute; (ii) their own usage of the systems is the main motivation to contribute to such projects; (iii) human and social factors played a key role when making these contributions; and (iv) lack of time and the difficulty of obtaining push access to the repositories are the main barriers they faced. Conclusions: Project abandonment is a reality even in large open source projects, and our work enables a better understanding of such risks and highlights ways of avoiding them.

SE · Mar 10, 2021
Identifying bot activity in GitHub pull request and issue comments

Mehdi Golzadeh, Alexandre Decan, Eleni Constantinou et al.

Development bots are used on GitHub to automate repetitive activities. Such bots communicate with human actors via issue comments and pull request comments. Identifying such bot comments helps prevent bias in socio-technical studies related to software development. To automate their identification, we propose a classification model based on natural language processing. Starting from a balanced ground-truth dataset of 19,282 PR and issue comments, we encode the comments as vectors using a combination of the bag of words and TF-IDF techniques. We train a range of binary classifiers to predict the type of comment (human or bot) based on this vector representation. A multinomial Naive Bayes classifier provides the best results. Its performance on a test set containing 50% of the data achieves an average precision, recall, and F1 score of 0.88. Although the model shows a promising result on pull request and issue comments, further work is required to generalize the model to other types of activities, like commit messages and code reviews.
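To make the classification idea in this abstract concrete, here is a minimal sketch of a multinomial Naive Bayes classifier over bag-of-words counts. It is an illustration under stated assumptions, not the authors' model: it uses raw word counts only (no TF-IDF weighting), a crude whitespace tokenizer, and a handful of made-up comments. The class name `TinyMultinomialNB` and all example texts are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace; a deliberately crude tokenizer."""
    return text.lower().split()


class TinyMultinomialNB:
    """Multinomial Naive Bayes over bag-of-words counts with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)       # documents per class
        self.word_counts = defaultdict(Counter)   # word frequencies per class
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in tokenize(text):
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up toy comments, not taken from the paper's dataset.
train_texts = [
    "coverage increased by 2% after this pull request",
    "build failed see the log for details",
    "this pull request has been automatically marked as stale",
    "thanks for the fix, looks good to me",
    "could you add a test for the edge case",
    "i think this breaks the api, can we discuss",
]
train_labels = ["bot", "bot", "bot", "human", "human", "human"]

model = TinyMultinomialNB().fit(train_texts, train_labels)
prediction = model.predict("build failed after this pull request")
print(prediction)
```

With a realistic dataset one would typically hold out part of the data for testing, as the abstract describes, and report precision, recall, and F1 on that held-out part.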

SE · Dec 12, 2018
Breaking the borders: an investigation of cross-ecosystem software packages

Eleni Constantinou, Alexandre Decan, Tom Mens

Software ecosystems are collections of projects that are developed and evolve together in the same environment. Existing literature investigates software ecosystems as isolated entities whose boundaries do not overlap and assumes they are self-contained. However, a number of software projects are distributed in more than one ecosystem. As different aspects of such cross-ecosystem packages (e.g., success, security vulnerabilities, bugs) can affect multiple ecosystems, we investigate the presence and characteristics of these cross-ecosystem packages in 12 large software distributions. We found that a small number of packages are distributed in multiple packaging ecosystems and that such packages are usually distributed in two ecosystems. These packages tend to support certain ecosystems better with new releases, while their evolution can impact a multitude of packages in other ecosystems. Finally, such packages appear to be popular, with large developer communities.

SE · Jun 5, 2018
On the evolution of technical lag in the npm package dependency network

Alexandre Decan, Tom Mens, Eleni Constantinou

Software packages developed and distributed through package managers extensively depend on other packages. These dependencies are regularly updated, for example to add new features, resolve bugs or fix security issues. In order to take full advantage of the benefits of this type of reuse, developers should keep their dependencies up to date by relying on the latest releases. In practice, however, this is not always possible, and packages lag behind with respect to the latest version of their dependencies. This phenomenon is described as technical lag in the literature. In this paper, we perform an empirical study of technical lag in the npm dependency network by investigating its evolution for over 1.4M releases of 120K packages and 8M dependencies between these releases. We explore how technical lag increases over time, taking into account the release type and the use of package dependency constraints. We also discuss how technical lag can be reduced by relying on the semantic versioning policy.
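The notion of technical lag described in this abstract can be illustrated with a small sketch. The abstract does not give a formula, so the following assumes one simple reading: lag is the number of days between the newest release a dependency constraint allows and the newest release available at a given date. The release history, the simplified caret (`^`) and tilde (`~`) rules (which ignore npm's special handling of 0.x versions and prereleases), and the function names are all hypothetical.

```python
from datetime import date

# Hypothetical release history of one dependency: (version, release date).
releases = [
    ((1, 0, 0), date(2017, 1, 10)),
    ((1, 1, 0), date(2017, 3, 2)),
    ((1, 1, 1), date(2017, 4, 20)),
    ((2, 0, 0), date(2017, 9, 15)),
]


def allowed(version, constraint):
    """Check a version tuple against a simplified caret (^) or tilde (~) constraint."""
    op = constraint[0]
    base = tuple(int(part) for part in constraint[1:].split("."))
    if version < base:
        return False
    if op == "^":  # assumption: any release with the same major version
        return version[0] == base[0]
    if op == "~":  # assumption: any release with the same major and minor version
        return version[:2] == base[:2]
    raise ValueError(f"unsupported constraint: {constraint}")


def technical_lag_days(releases, constraint, as_of):
    """Days between the newest allowed release and the newest available release."""
    available = [(v, d) for v, d in releases if d <= as_of]
    usable = [(v, d) for v, d in available if allowed(v, constraint)]
    if not usable:
        return None  # nothing satisfies the constraint yet
    newest_available_date = max(available)[1]
    newest_usable_date = max(usable)[1]
    return (newest_available_date - newest_usable_date).days


lag_caret = technical_lag_days(releases, "^1.0.0", date(2017, 12, 1))
lag_tilde = technical_lag_days(releases, "~1.0.0", date(2017, 12, 1))
print(lag_caret, lag_tilde)
```

In this toy history the tilde constraint accumulates more lag than the caret constraint, because it also excludes minor releases. This is the kind of effect of constraint choice that the abstract mentions.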

SE · Nov 18, 2017
Automatic link extraction: The good, the bad and the ugly in software ecosystem mining

Eleni Constantinou, Tom Mens

This abstract presents the pitfalls of automatic link extraction, based on our experience in manually investigating links in the RubyGems package manager metadata. This work can lead to automating the link extraction approach so as to avoid these pitfalls and produce more complete datasets for researchers investigating the multi-platform evolution of software ecosystems.

SE · Aug 8, 2017
An Empirical Comparison of Developer Retention in the RubyGems and npm Software Ecosystems

Eleni Constantinou, Tom Mens

Software ecosystems can be viewed as socio-technical networks consisting of technical components (software packages) and social components (communities of developers) that maintain the technical components. Ecosystems evolve over time through socio-technical changes that may greatly impact the ecosystem's sustainability. Social changes like developer turnover may lead to technical degradation. This motivates the need to identify the factors leading to developer abandonment, in order to automate the process of identifying developers with high abandonment risk. This paper compares such factors for two software package ecosystems, RubyGems and npm. We analyse the evolution of their packages hosted on GitHub, considering development activity in terms of commits, and social interaction with other developers in terms of comments associated with commits, issues or pull requests. We analyse this socio-technical activity for more than 30k and 60k developers for RubyGems and npm respectively. We use survival analysis to identify which factors coincide with a lower survival probability. Our results reveal that developers with a higher probability of abandoning an ecosystem: do not engage in discussions with other developers; do not have strong social and technical activity intensity; communicate or commit less frequently; and do not participate in both technical and social activities for long periods of time. Such observations could be used to automate the identification of developers with a high probability of abandoning the ecosystem and, as such, reduce the risks associated with knowledge loss.
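The abstract mentions survival analysis without naming a specific estimator. As an illustration only, the sketch below implements the Kaplan-Meier estimator, a common choice for this kind of data; whether the paper uses it is an assumption. The follow-up durations and the `kaplan_meier` function are hypothetical.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival probability)] at each time where an event occurred.

    durations: how long each developer was followed (e.g. months of activity)
    observed:  True if the developer abandoned, False if still active (censored)
    """
    event_times = sorted({t for t, e in zip(durations, observed) if e})
    survival, curve = 1.0, []
    for t in event_times:
        # Developers still under observation just before time t.
        at_risk = sum(1 for d in durations if d >= t)
        # Developers who abandoned exactly at time t.
        events = sum(1 for d, e in zip(durations, observed) if d == t and e)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve


# Made-up follow-up data for six hypothetical developers.
durations = [3, 5, 5, 8, 12, 12]
observed = [True, True, False, True, False, False]

curve = kaplan_meier(durations, observed)
for time, probability in curve:
    print(time, round(probability, 3))
```

Comparing such curves between groups of developers (for example, those who take part in discussions and those who do not) is one way to see which factors coincide with a lower survival probability.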