CV · Mar 31, 2023

Trade-offs in Fine-tuned Diffusion Models Between Accuracy and Interpretability

arXiv:2303.17908v2 · 13 citations · h-index: 40 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses the need for interpretable generative models in medical applications, where understanding model decisions is critical. Its contribution is incremental: it highlights one specific trade-off rather than introducing a new method.

The study investigates the trade-off between image fidelity and interpretability in fine-tuned diffusion models for medical imaging, finding that using learnable text encoders reduces interpretability, and proposes design principles to address this issue.

Recent advancements in diffusion models have significantly impacted the trajectory of generative machine learning research, with many adopting the strategy of fine-tuning pre-trained models using domain-specific text-to-image datasets. Notably, this method has been readily employed for medical applications, such as X-ray image synthesis, leveraging the plethora of associated radiology reports. Yet, a prevailing concern is the lack of assurance on whether these models genuinely comprehend their generated content. With the evolution of text-conditional image generation, these models have grown potent enough to facilitate object localization scrutiny. Our research underscores this advancement in the critical realm of medical imaging, emphasizing the crucial role of interpretability. We further unravel a consequential trade-off between image fidelity as gauged by conventional metrics and model interpretability in generative diffusion models. Specifically, the adoption of learnable text encoders when fine-tuning results in diminished interpretability. Our in-depth exploration uncovers the underlying factors responsible for this divergence. Consequently, we present a set of design principles for the development of truly interpretable generative models. Code is available at https://github.com/MischaD/chest-distillation.

Code Implementations: 1 repo
