Mining Container Image Repositories for Software Configuration and Beyond
This work addresses the need for better software deployment insights for developers and researchers, but it is incremental as it introduces a new data source without presenting novel methods or results.
The paper tackles the problem of extracting software configuration and deployment information by mining container image repositories, which encapsulate the entire execution ecosystem. It highlights opportunities for software engineering tasks and summarizes the challenges and approaches to facilitate future research.
This paper introduces the idea of mining container image repositories for configuration and other deployment information of software systems. Unlike traditional software repositories (e.g., source code repositories and app stores), image repositories encapsulate the entire execution ecosystem for running target software, including its configurations, dependent libraries and components, and OS-level utilities, which together provide a wealth of data and information. We showcase these opportunities through concrete software engineering tasks that can benefit from mining image repositories. To facilitate future mining efforts, we summarize the challenges of analyzing image repositories and the approaches that can address these challenges. We hope that this paper will stimulate an exciting research agenda on mining this emerging type of software repository.
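As a rough illustration of the kind of information an image carries, the following sketch parses an image configuration document and pulls out environment variables, the startup command, exposed ports, and build history. It is not the paper's method. The sample document is hand-written and hypothetical, the function name `summarize_image_config` is our own, and the field names (`config.Env`, `config.Entrypoint`, `config.Cmd`, `config.ExposedPorts`, `rootfs.diff_ids`, `history[].created_by`) are assumed to follow the OCI image configuration layout.

```python
import json

# Hypothetical, hand-written image configuration used only for illustration.
# Field names are assumed to follow the OCI image configuration layout.
SAMPLE_CONFIG = """
{
  "architecture": "amd64",
  "os": "linux",
  "config": {
    "Env": ["PATH=/usr/local/bin:/usr/bin", "APP_MODE=production"],
    "Entrypoint": ["/usr/bin/app"],
    "Cmd": ["--port", "8080"],
    "ExposedPorts": {"8080/tcp": {}}
  },
  "rootfs": {"type": "layers", "diff_ids": ["sha256:aaa", "sha256:bbb"]},
  "history": [
    {"created_by": "/bin/sh -c apt-get install -y libssl3"},
    {"created_by": "/bin/sh -c #(nop) ENV APP_MODE=production", "empty_layer": true}
  ]
}
"""


def summarize_image_config(raw: str) -> dict:
    """Extract deployment-relevant fields from an image configuration JSON string."""
    doc = json.loads(raw)
    cfg = doc.get("config") or {}

    # Env entries are "NAME=value" strings; split only on the first "=".
    env_items = cfg.get("Env") or []
    env = dict(item.split("=", 1) for item in env_items if "=" in item)

    return {
        "platform": f'{doc.get("os")}/{doc.get("architecture")}',
        "env": env,
        # The effective startup command is the entrypoint followed by its default arguments.
        "command": (cfg.get("Entrypoint") or []) + (cfg.get("Cmd") or []),
        "exposed_ports": sorted((cfg.get("ExposedPorts") or {}).keys()),
        "layer_count": len((doc.get("rootfs") or {}).get("diff_ids") or []),
        # Build steps hint at installed libraries and OS-level utilities.
        "build_steps": [h.get("created_by", "") for h in doc.get("history") or []],
    }


if __name__ == "__main__":
    summary = summarize_image_config(SAMPLE_CONFIG)
    for key, value in summary.items():
        print(f"{key}: {value}")
```

In a real mining pipeline, the configuration document would be fetched from a registry or exported from a local image rather than embedded as a string, and the layer contents would need to be unpacked separately to inspect configuration files and installed packages.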