Tracing Vulnerable Code Lineage
This work addresses vulnerability propagation, a practical concern for software security practitioners, but its contribution is incremental because it builds on existing infrastructure and methods.
The paper tackles the problem of tracking the spread of known security vulnerabilities across open source software repositories by identifying file-level code duplication, using the World of Code infrastructure to analyze a nearly complete collection of open source projects.
This paper presents results from the MSR 2021 Hackathon. Our team investigates files and projects that contain known security vulnerabilities and how widely those files have spread across open source repositories. Such vulnerabilities can propagate through code reuse even after they have been fixed in other versions of the code. We use the World of Code infrastructure to discover file-level duplication of code in a nearly complete collection of open source software. The paper describes a method and a set of tools for finding all open source projects that use known vulnerable files or any previous revisions of those files.
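To make the idea of file-level lineage concrete, the following is a minimal sketch, not the paper's tooling. It assumes that two files count as duplicates when their contents are byte-identical, which can be tested by comparing Git blob hashes, and it stands in for the World of Code lookups with a small in-memory dictionary. The function names, project names, and file contents are hypothetical and chosen only for illustration.

```python
import hashlib
from collections import defaultdict


def git_blob_sha1(content: bytes) -> str:
    """Return the SHA-1 identifier Git assigns to a blob with this content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def projects_sharing_blobs(blob_ids, blob_to_projects):
    """Map each project to the subset of the given blobs that it contains."""
    hits = defaultdict(set)
    for blob_id in blob_ids:
        for project in blob_to_projects.get(blob_id, ()):
            hits[project].add(blob_id)
    return dict(hits)


# Hypothetical revisions of one file: two vulnerable versions and a fix.
older = b"strcpy(buf, src);\n"
vulnerable = b"strcpy(buf, input);\n"
fixed = b"strncpy(buf, input, sizeof buf - 1);\n"

# Assumption: the vulnerable file's history is already known, so we can
# list the blob of the flagged revision and of every earlier revision.
vulnerable_lineage = [git_blob_sha1(older), git_blob_sha1(vulnerable)]

# Hypothetical stand-in for a blob-to-project index over many repositories.
blob_to_projects = {
    git_blob_sha1(older): {"alice/fork"},
    git_blob_sha1(vulnerable): {"upstream/lib", "bob/app"},
    git_blob_sha1(fixed): {"upstream/lib"},
}

affected = projects_sharing_blobs(vulnerable_lineage, blob_to_projects)
for project in sorted(affected):
    print(project, len(affected[project]))
```

In this toy data, upstream/lib is reported even though it also contains the fixed revision, because the vulnerable blob is still present somewhere in its history. Exact hash matching only detects verbatim copies; a copy with even a whitespace change would need a different similarity measure.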