SE, IR, LG
Jul 11, 2022

Dev2vec: Representing Domain Expertise of Developers in an Embedding Space

arXiv:2207.05132v1
14 citations
h-index: 48
Originality: Incremental advance
AI Analysis

The paper addresses the need for automated expertise assessment in software development, for tasks such as project assignment or hiring. It is an incremental advance because it adapts an existing method, doc2vec, to a new context.

The paper tackles the problem of automatically assessing developers' domain expertise across multiple software projects. It represents that expertise as embedding vectors learned with doc2vec and reports an F1-score improvement of up to 21% over state-of-the-art methods.

Accurate assessment of the domain expertise of developers is important for assigning the proper candidate to contribute to a project or to fill a job role. Since potential candidates can come from a large pool, the automated assessment of this domain expertise is a desirable goal. While previous methods have had some success within a single software project, the assessment of a developer's domain expertise from contributions across multiple projects is more challenging. In this paper, we employ doc2vec to represent the domain expertise of developers as embedding vectors. These vectors are derived from different sources that contain evidence of developers' expertise, such as the descriptions of repositories that they contributed to, their issue resolving history, and API calls in their commits. We name the approach dev2vec and demonstrate its effectiveness in representing the technical specialization of developers. Our results indicate that encoding the expertise of developers in an embedding vector outperforms state-of-the-art methods and improves the F1-score by up to 21%. Moreover, our findings suggest that the "issue resolving history" of developers is the most informative source for representing their domain expertise in embedding spaces.
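The idea in the abstract, one text "document" per developer and evidence source, embedded and then compared in vector space, can be illustrated with a minimal sketch. This is not the authors' implementation: the developer names and texts are hypothetical toy data, and a plain term-frequency vector stands in for doc2vec so that the example runs with the standard library only. In practice a trained doc2vec model (for example from a library such as gensim) would replace the `embed` function.

```python
import math
from collections import Counter

# Hypothetical toy data: one text document per developer and evidence source,
# mirroring the three sources named in the abstract.
developers = {
    "alice": {
        "repo_descriptions": "deep learning library for image classification",
        "issue_history": "fix gradient overflow in training loop for neural network",
        "api_calls": "torch.nn.Linear torch.optim.Adam torch.Tensor.backward",
    },
    "bob": {
        "repo_descriptions": "rest api server for web application",
        "issue_history": "fix session cookie bug in http request handler",
        "api_calls": "flask.Flask flask.request flask.jsonify",
    },
    "carol": {
        "repo_descriptions": "toolkit for training image segmentation models",
        "issue_history": "neural network training loss diverges fix learning rate",
        "api_calls": "torch.nn.Conv2d torch.optim.SGD torch.utils.data.DataLoader",
    },
}


def tokenize(text):
    """Lowercase, split dotted API names into parts, then split on whitespace."""
    return text.lower().replace(".", " ").split()


def embed(text):
    """Stand-in for doc2vec (an assumption): a sparse term-frequency vector."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(name, source):
    """Return the other developer whose vector for `source` is closest to `name`'s."""
    query = embed(developers[name][source])
    scores = {
        other: cosine(query, embed(sources[source]))
        for other, sources in developers.items()
        if other != name
    }
    return max(scores, key=scores.get)


if __name__ == "__main__":
    for source in ("repo_descriptions", "issue_history", "api_calls"):
        print(source, "->", most_similar("alice", source))
```

Keeping the sources separate, rather than concatenating them into a single document per developer, makes it possible to compare how informative each source is on its own, which is the kind of comparison the abstract reports for issue resolving history.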

Code Implementations: 1 repo
