SE · IR · LG · SI · Oct 15, 2022

Code Recommendation for Open Source Software Developers

Georgia Tech
arXiv:2210.08332v3 · 28 citations · h-index: 28 · Has Code
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of matching developers to appropriate tasks in open source projects. The advance is incremental: it builds on existing recommendation methods and adds a novel graph-based approach.

The paper tackles the problem of recommending development tasks to open source software developers by predicting their future contributions. It proposes CODER, a graph-based framework that models user-code and user-project interactions, and reports that CODER outperforms the compared baselines in intra-project, cross-project, and cold-start settings.

Open Source Software (OSS) is forming the spines of technology infrastructures, attracting millions of talents to contribute. Notably, it is challenging and critical to consider both the developers' interests and the semantic features of the project code to recommend appropriate development tasks to OSS developers. In this paper, we formulate the novel problem of code recommendation, whose purpose is to predict the future contribution behaviors of developers given their interaction history, the semantic features of source code, and the hierarchical file structures of projects. Considering the complex interactions among multiple parties within the system, we propose CODER, a novel graph-based code recommendation framework for open source software developers. CODER jointly models microscopic user-code interactions and macroscopic user-project interactions via a heterogeneous graph and further bridges the two levels of information through aggregation on file-structure graphs that reflect the project hierarchy. Moreover, due to the lack of reliable benchmarks, we construct three large-scale datasets to facilitate future research in this direction. Extensive experiments show that our CODER framework achieves superior performance under various experimental settings, including intra-project, cross-project, and cold-start recommendation. We will release all the datasets, code, and utilities for data retrieval upon the acceptance of this work.
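To make the idea of bridging file-level and project-level signals more concrete, here is a minimal sketch in plain Python. It is not the CODER implementation: the abstract does not specify the aggregation operator or the scoring function, so mean pooling over the directory tree, dot-product scoring, the blending weight `alpha`, and all names and toy embeddings below are assumptions made purely for illustration. Paths are assumed to be relative and POSIX-style.

```python
from collections import defaultdict
from posixpath import dirname


def mean(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def aggregate_tree(file_vecs):
    """Pool file embeddings bottom-up into directory embeddings.

    file_vecs maps a relative POSIX path to a toy embedding. Each directory
    embedding is the mean of its direct children (files and subdirectories).
    The root directory "" stands in for the whole project.
    Mean pooling is an assumption, not the operator used in the paper.
    """
    children = defaultdict(set)
    for path in file_vecs:
        node = path
        while node:
            parent = dirname(node)
            children[parent].add(node)
            node = parent
    vecs = dict(file_vecs)
    # Visit deeper directories first so children are ready before parents.
    depth = lambda p: p.count("/") if p else -1
    for d in sorted(children, key=depth, reverse=True):
        vecs[d] = mean([vecs[c] for c in sorted(children[d])])
    return vecs


def rank_files(user_vec, projects, alpha=0.5):
    """Rank files across projects by blending two hypothetical scores:
    a file-level (microscopic) score and a project-level (macroscopic) one."""
    scored = []
    for name, file_vecs in projects.items():
        project_score = dot(user_vec, aggregate_tree(file_vecs)[""])
        for path, vec in file_vecs.items():
            score = alpha * dot(user_vec, vec) + (1 - alpha) * project_score
            scored.append((score, name, path))
    scored.sort(key=lambda t: t[0], reverse=True)
    return [(name, path) for _, name, path in scored]


# Toy data: two projects with hand-made 2-d file embeddings (hypothetical).
projects = {
    "proj_a": {
        "src/model.py": [1.0, 0.0],
        "src/train.py": [0.8, 0.2],
        "docs/index.md": [0.0, 1.0],
    },
    "proj_b": {
        "README.md": [0.0, 1.0],
        "lib/util.py": [0.2, 0.8],
    },
}
user = [1.0, 0.0]  # a developer whose history leans toward the first dimension

ranking = rank_files(user, projects)
print(ranking[0])  # ('proj_a', 'src/model.py')
```

In this sketch the project-level term only shifts all files of a project together, so it matters when candidates span several projects, which loosely mirrors the cross-project setting mentioned in the abstract. A learned model would replace the fixed embeddings and the hand-set `alpha`.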

Code Implementations: 1 repo