SE | Apr 20

Cache-Related Smells in GitLab CI/CD: Comprehensive Catalog, Automated Detection, and Empirical Evidence

arXiv:2604.17890 | 46.6 | h-index: 15 | Has Code
AI Analysis

This work provides a systematic catalog and automated detection tool for cache misconfigurations in GitLab CI/CD, addressing a gap in prior research that focused only on missing dependency caches.

The paper presents a catalog of ten cache-related smells in GitLab CI/CD that degrade performance and reliability, and proposes CROSSER, a tool that detects seven of them with an F1 score of 0.98. Empirical analysis of 228 projects reveals that only 11% are free of these smells, indicating widespread issues.

Continuous Integration and Deployment (CI/CD) facilitate rapid software delivery, making fast feedback and minimal downtime essential. While caching has been shown to be an effective technique for tackling pipeline performance and reliability issues, existing works have primarily focused on missing dependency caches, ignoring other types of caches and cache misconfigurations. In this paper, we present a comprehensive catalog of ten cache-related smells in GitLab CI/CD that negatively impact performance and reliability, validated on a corpus of grey literature. To address the smells, we propose CROSSER, a tool that automatically detects seven of the ten smells. We evaluate CROSSER on a manually labeled dataset of 82 mature projects, achieving an overall F1 score of 0.98. Finally, we investigate the presence of smells across a large dataset of 228 mature open-source projects and outline our empirical findings. Our results show a widespread frequency of the smells, as only 11% of the projects do not present any. We also show that developers may not be aware of higher-level caching functionalities.

