Code Ownership in Open-Source AI Software Security
This work addresses security measurement for developers and curators of open-source AI software, but its contribution is incremental because it applies existing code ownership concepts to a new domain.
The paper investigated the relationship between code ownership metrics and vulnerabilities in open-source AI software projects, finding that high-level ownership correlates with a decrease in vulnerabilities. It introduced time-based metrics to categorize project phases and vulnerability intensities, and developed a Python tool for project evaluation.
As open-source AI software projects become an integral component of AI software development, it is critical to develop novel methods to ensure and measure the security of these projects for developers. Code ownership, pivotal in the evolution of such projects, offers insights into developer engagement and potential vulnerabilities. In this paper, we leverage code ownership metrics to empirically investigate their correlation with latent vulnerabilities across five prominent open-source AI software projects. The findings from the large-scale empirical study suggest a positive relationship between high-level ownership (characterised by a limited number of minor contributors) and a decrease in vulnerabilities. Furthermore, we introduce time metrics, anchored on the project's duration, individual source code file timelines, and the count of impacted releases. These metrics categorise distinct phases of open-source AI software projects and their respective vulnerability intensities. With these novel code ownership metrics, we have implemented a Python-based command-line application to aid project curators and quality assurance professionals in evaluating and benchmarking their on-site projects. We anticipate this work will initiate continued research on securing and measuring the security of open-source AI projects.
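To make the notion of ownership concrete, the following is a minimal sketch of how per-file ownership figures might be computed from a list of commit authors. It is not the paper's tool: the function name `ownership_metrics`, the returned field names, and the 5% cut-off for a "minor" contributor are assumptions chosen for illustration (the cut-off follows a convention common in the code ownership literature, and the abstract does not state the threshold the authors used).

```python
from collections import Counter


def ownership_metrics(commit_authors, minor_threshold=0.05):
    """Summarise ownership of one file from the authors of its commits.

    `minor_threshold` is an assumed cut-off: an author whose share of
    commits falls below it is counted as a minor contributor.
    """
    counts = Counter(commit_authors)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("file has no commits")

    # Share of the file's commits made by each author.
    shares = {author: n / total for author, n in counts.items()}
    minor = [a for a, s in shares.items() if s < minor_threshold]

    return {
        "total_contributors": len(counts),
        "top_owner_share": max(shares.values()),
        "minor_contributors": len(minor),
        "major_contributors": len(counts) - len(minor),
    }


# Hypothetical commit history for a single source file.
history = ["alice"] * 18 + ["bob"] * 5 + ["carol"]
metrics = ownership_metrics(history)
print(metrics)
```

Under these assumptions, a file with a large top-owner share and few minor contributors would count as having high ownership, which is the condition the abstract associates with fewer vulnerabilities.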