From Release to Adoption: Challenges in Reusing Pre-trained AI Models for Downstream Developers
For downstream developers and researchers, this work provides a taxonomy of challenges in reusing pre-trained models, highlighting the need for better support in model reuse.
This study analyzes 840 PTM-related issue reports from 31 open-source projects, identifying seven key categories of challenges downstream developers face when reusing pre-trained models. It finds that PTM-related issues take significantly longer to resolve than non-PTM issues, and that resolution time varies significantly across challenge categories.
Pre-trained models (PTMs) have gained widespread popularity and achieved remarkable success across various fields, driven by their groundbreaking performance and easy accessibility through hosting providers. However, the challenges that downstream developers face when reusing PTMs in software systems remain underexplored. To bridge this knowledge gap, we created and qualitatively analyzed a dataset of 840 PTM-related issue reports from 31 open-source GitHub projects. From this analysis, we systematically developed a comprehensive taxonomy of PTM-related challenges in downstream projects, comprising seven key categories that include model usage, model performance, and output quality. We also compared our findings with existing taxonomies. Additionally, we conducted a resolution time analysis and, based on statistical tests, found that PTM-related issues take significantly longer to resolve than issues unrelated to PTMs, with significant variation across challenge categories. We discuss the implications of our findings for practitioners and possibilities for future research.