Timelines for In-Code Discovery of Zero-Day Vulnerabilities and Supply-Chain Attacks
This work addresses the longevity of zero-day vulnerabilities, a question that matters for software security, but it is incremental in that it builds on existing models to estimate discovery timelines.
The study estimates how long zero-day vulnerabilities remain undiscovered in code by analyzing version-to-version changes in open-source software (Mozilla Firefox, GNU/Linux, and glibc). It uses data from over a billion lines of code across 87 versions to specify bounds on discoverability, from expertly hidden to obvious vulnerabilities.
Zero-day vulnerabilities can be accidentally or maliciously placed in code and can remain in place for years. In this study, we address an aspect of their longevity by considering the likelihood that they will be discovered in the code across versions. We approximate well-disguised vulnerabilities as only being discoverable if the relevant lines of code are explicitly examined, and obvious vulnerabilities as being discoverable if any part of the relevant file is examined. We analyze the version-to-version changes in three types of open-source software (Mozilla Firefox, GNU/Linux, and glibc) to understand the rate at which the various pieces of code are amended and find that much of the revision behavior can be captured with a simple, intuitive model. We use that model and the data from over a billion unique lines of code in 87 different versions of software to specify the bounds for in-code discoverability of vulnerabilities, from expertly hidden to obviously observable.
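The two bounding cases can be illustrated with a minimal sketch. It assumes, as a simplification, that a line being amended between two versions stands in for that line being examined. The function names, the `vuln_lines` parameter, and the use of Python's `difflib` are hypothetical choices made for illustration; they are not the model or tooling used in the study.

```python
import difflib


def changed_old_lines(old, new):
    """Return 0-based indices of lines in `old` that were replaced or deleted in `new`."""
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    touched = set()
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            touched.update(range(i1, i2))
    return touched


def discoverability(old, new, vuln_lines):
    """Classify one version-to-version step for a single file.

    Assumption (hypothetical proxy): amended code counts as examined code.
    - "hidden":  a well-disguised flaw is exposed only if one of its own
                 lines was amended.
    - "obvious": an obvious flaw is exposed if any part of the file changed.
    """
    touched = changed_old_lines(old, new)
    return {
        "hidden": bool(touched & set(vuln_lines)),
        "obvious": old != new,
    }


# Toy file in which line index 2 carries the flaw.
v1 = ["int n = read();", "char buf[8];", "copy(buf, n);", "return 0;"]
# v2 adds an unrelated line elsewhere in the file.
v2 = ["int n = read();", "char buf[8];", "copy(buf, n);", "log(n);", "return 0;"]
# v3 amends the flawed line itself.
v3 = ["int n = read();", "char buf[8];", "copy(buf, min(n, 8));", "return 0;"]

step_unrelated = discoverability(v1, v2, vuln_lines={2})
step_direct = discoverability(v1, v3, vuln_lines={2})
```

In this toy case, the unrelated edit exposes only the obvious variant of the flaw, while the direct edit exposes both. Counting such outcomes over many files and version pairs would give the kind of lower (expertly hidden) and upper (obviously observable) bounds the abstract refers to.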