Juri Di Rocco

SE
5papers
92citations
Novelty38%
AI Score26

5 Papers

SEMay 19, 2022
GitRanking: A Ranking of GitHub Topics for Software Classification using Active Sampling

Cezar Sas, Andrea Capiluppi, Claudio Di Sipio et al.

GitHub is the world's largest host of source code, with more than 150M repositories. However, most of these repositories are not labeled or inadequately so, making it harder for users to find relevant projects. There have been various proposals for software application domain classification over the past years. However, these approaches lack a well-defined taxonomy that is hierarchical, grounded in a knowledge base, and free of irrelevant terms. This work proposes GitRanking, a framework for creating a classification ranked into discrete levels based on how general or specific their meaning is. We collected 121K topics from GitHub and considered $60\%$ of the most frequent ones for the ranking. GitRanking 1) uses active sampling to ensure a minimal number of required annotations; and 2) links each topic to Wikidata, reducing ambiguities and improving the reusability of the taxonomy. Our results show that developers, when annotating their projects, avoid using terms with a high degree of specificity. This makes the finding and discovery of their projects more challenging for other users. Furthermore, we show that GitRanking can effectively rank terms according to their general or specific meaning. This ranking would be an essential asset for developers to build upon, allowing them to complement their annotations with more precise topics. Finally, we show that GitRanking is a dynamically extensible method: it can currently accept further terms to be ranked with a minimum number of annotations ($\sim$ 15). This paper is the first collective attempt to build a ground-up taxonomy of software domains.

SEMar 11, 2021Code
Development of recommendation systems for software engineering: the CROSSMINER experience

Juri Di Rocco, Davide Di Ruscio, Claudio Di Sipio et al.

To perform their daily tasks, developers intensively make use of existing resources by consulting open-source software (OSS) repositories. Such platforms contain rich data sources, e.g., code snippets, documentation, and user discussions, that can be useful for supporting development activities. Over the last decades, several techniques and tools have been promoted to provide developers with innovative features, aiming to bring in improvements in terms of development effort, cost savings, and productivity. In the context of the EU H2020 CROSSMINER project, a set of recommendation systems has been conceived to assist software programmers in different phases of the development process. The systems provide developers with various artifacts, such as third-party libraries, documentation about how to use the APIs being adopted, or relevant API function calls. To develop such recommendations, various technical choices have been made to overcome issues related to several aspects including the lack of baselines, limited data availability, decisions about the performance measures, and evaluation approaches. This paper is an experience report to present the knowledge pertinent to the set of recommendation systems developed through the CROSSMINER project. We explain in detail the challenges we had to deal with, together with the related lessons learned when developing and evaluating these systems. Our aim is to provide the research community with concrete takeaway messages that are expected to be useful for those who want to develop or customize their own recommendation systems. The reported experiences can facilitate interesting discussions and research work, which in the end contribute to the advancement of recommendation systems applied to solve different issues in Software Engineering.

SEFeb 15, 2021Code
Recommending API Function Calls and Code Snippets to Support Software Development

Phuong T. Nguyen, Juri Di Rocco, Claudio Di Sipio et al.

Software development activity has reached a high degree of complexity, guided by the heterogeneity of the components, data sources, and tasks. The proliferation of open-source software (OSS) repositories has stressed the need to reuse available software artifacts efficiently. To this aim, it is necessary to explore approaches to mine data from software repositories and leverage it to produce helpful recommendations. We designed and implemented FOCUS as a novel approach to provide developers with API calls and source code while they are programming. The system works on the basis of a context-aware collaborative filtering technique to extract API usages from OSS projects. In this work, we show the suitability of FOCUS for Android programming by evaluating it on a dataset of 2,600 mobile apps. The empirical evaluation results show that our approach outperforms two state-of-the-art API recommenders, UP-Miner and PAM, in terms of prediction accuracy. We also point out that there is no significant relationship between the categories for apps defined in Google Play and their API usages. Finally, we show that participants of a user study positively perceive the API and source code recommended by FOCUS as relevant to the current development context.

SEJan 20, 2022
Providing Upgrade Plans for Third-party Libraries: A Recommender System using Migration Graphs

Riccardo Rubei, Davide Di Ruscio, Claudio Di Sipio et al.

During the development of a software project, developers often need to upgrade third-party libraries (TPLs), aiming to keep their code up-to-date with the newest functionalities offered by the used libraries. In most cases, upgrading used TPLs is a complex and error-prone activity that must be carefully carried out to limit the ripple effects on the software project that depends on the libraries being upgraded. In this paper, we propose EvoPlan as a novel approach to the recommendation of different upgrade plans given a pair of library-version as input. In particular, among the different paths that can be possibly followed to upgrade the current library version to the desired updated one, EvoPlan is able to suggest the plan that can potentially minimize the efforts being needed to migrate the code of the clients from the library's current release to the target one. The approach has been evaluated on a curated dataset using conventional metrics used in Information Retrieval, i.e., precision, recall, and F-measure. The experimental results show that EvoPlan obtains an encouraging prediction performance considering two different criteria in the plan specification, i.e., the popularity of migration paths and the number of open and closed issues in GitHub for those projects that have already followed the recommended migration paths.

SENov 29, 2021
Enhancing syntax expressiveness in domain-specific modelling

Damiano Di Vicenzo, Juri Di Rocco, Davide Di Ruscio et al.

Domain-specific modelling helps tame the complexity of today's application domains by formalizing concepts and their relationships in modelling languages. While meta-editors are widely-used frameworks for implementing graphical editors for such modelling languages, they are best suitable for defining {novel} topological notations, i.e., syntaxes where the model layout does not contribute to the model semantics. In contrast, many engineering fields, e.g., railways systems or electrical engineering, use notations that, on the one hand, are standard and, on the other hand, are demanding more expressive power than topological syntaxes. In this paper, we discuss the problem of enhancing the expressiveness of modelling editors towards geometric/positional syntaxes. Several potential solutions are experimentally implemented on the jjodel web-based platform with the aim of identifying challenges and opportunities.