CR · LG · SE · Feb 4, 2023

Detecting Security Patches via Behavioral Data in Code Repositories

arXiv:2302.02112v1 · 1 citation · h-index: 42
AI Analysis

This paper addresses the problem of undisclosed vulnerabilities, which leave users and security analysts unaware that a fix has been made, and presents a novel approach to detecting such silent fixes.

The paper tackles the problem of detecting security patches in Git repositories without analyzing code or commit messages, using only developer behavior data, achieving 88.3% accuracy and 89.8% F1 score.

The vast majority of software today is developed collaboratively using version control tools such as Git. It is common practice that once a vulnerability is detected and fixed, the developers behind the software issue a Common Vulnerabilities and Exposures (CVE) record to alert the user community to the security hazard and urge them to integrate the security patch. However, some companies might not disclose their vulnerabilities and instead just update their repository. As a result, users are unaware of the vulnerability and may remain exposed. In this paper, we present a system that automatically identifies security patches using only developer behavior in the Git repository, without analyzing the code itself or the remarks that accompany the fix (the commit message). We show that we can reveal concealed security patches with an accuracy of 88.3% and an F1 score of 89.8%. This is the first time a language-oblivious solution to this problem has been presented.
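To make the idea of "behavior only" input more concrete, here is a minimal Python sketch of deriving per-commit features from commit metadata alone, never from diffs or commit messages. The abstract does not list the paper's actual features or model, so everything here is an assumption for illustration: the `CommitMeta` record, the `behavior_features` helper, and the four features chosen (time gap since the previous commit, hour of day, number of files touched, first-time author) are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class CommitMeta:
    """Hypothetical metadata record: no diff content, no commit message."""
    sha: str
    author: str
    timestamp: int       # Unix time in seconds
    files_changed: int   # count only, not the file contents


def behavior_features(commits):
    """Return one feature dict per commit, ordered oldest first.

    The features are illustrative assumptions, not the paper's feature set.
    """
    ordered = sorted(commits, key=lambda c: c.timestamp)
    seen_authors = set()
    prev_ts = None
    rows = []
    for c in ordered:
        hour = datetime.fromtimestamp(c.timestamp, tz=timezone.utc).hour
        rows.append({
            "sha": c.sha,
            # Seconds since the previous commit; None for the first commit.
            "gap_seconds": None if prev_ts is None else c.timestamp - prev_ts,
            "hour_utc": hour,
            "files_changed": c.files_changed,
            # True if this author has not appeared earlier in the history.
            "first_time_author": c.author not in seen_authors,
        })
        seen_authors.add(c.author)
        prev_ts = c.timestamp
    return rows


commits = [
    CommitMeta("b", "bob", 18100, 2),
    CommitMeta("a", "alice", 18000, 1),
    CommitMeta("c", "alice", 108000, 3),
]
rows = behavior_features(commits)
```

Rows like these could then be fed to any standard classifier. Because nothing in them depends on the programming language of the repository, the sketch is language-oblivious in the same sense the abstract describes.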

Code Implementations: 1 repo