SafeFix: Targeted Model Repair via Controlled Image Generation
This work addresses the problem of improving model robustness for rare-case errors in visual recognition, representing an incremental advancement over existing debugging and repair methods.
The paper tackles systematic errors in deep learning models for visual recognition caused by underrepresented semantic subpopulations by introducing a model repair module that uses a conditional text-to-image model and a large vision-language model to generate and filter targeted synthetic images for retraining, significantly reducing errors associated with rare cases.
Deep learning models for visual recognition often exhibit systematic errors due to underrepresented semantic subpopulations. Although existing debugging frameworks can pinpoint these failures by identifying key failure attributes, repairing the model effectively remains difficult. Current solutions often rely on manually designed prompts to generate synthetic training images -- an approach prone to distribution shift and semantic errors. To overcome these challenges, we introduce a model repair module that builds on an interpretable failure attribution pipeline. Our approach uses a conditional text-to-image model to generate semantically faithful and targeted images for failure cases. To preserve the quality and relevance of the generated samples, we further employ a large vision-language model (LVLM) to filter the outputs, enforcing alignment with the original data distribution and maintaining semantic consistency. By retraining vision models with this rare-case-augmented synthetic dataset, we significantly reduce errors associated with rare cases. Our experiments demonstrate that this targeted repair strategy improves model robustness without introducing new bugs. Code is available at https://github.com/oxu2/SafeFix