LG | CV | Sep 25, 2025

A Unified Framework for Diffusion Model Unlearning with f-Divergence

arXiv:2509.21167v1 | 2 citations | h-index: 5
Originality: Incremental advance
AI Analysis

This work addresses machine unlearning for text-to-image diffusion models, offering an incremental improvement by generalizing existing methods into a more flexible framework.

The paper tackles the problem of removing specific knowledge from diffusion models by proposing a unified framework based on f-divergences, showing that existing MSE-based methods are a special case and enabling flexible selection of divergences to balance unlearning and preservation.

Machine unlearning aims to remove specific knowledge from a trained model. While diffusion models (DMs) have shown remarkable generative capabilities, existing unlearning methods for text-to-image (T2I) models often rely on minimizing the mean squared error (MSE) between the output distributions of a target concept and an anchor concept. We show that this MSE-based approach is a special case of a unified $f$-divergence-based framework, in which any $f$-divergence can be used. We analyze the benefits of using different $f$-divergences, which mainly affect the convergence properties of the algorithm and the quality of unlearning. The proposed unified framework offers a flexible paradigm for selecting the most suitable divergence for a specific application, balancing the trade-off between aggressive unlearning and concept preservation.
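For readers unfamiliar with the term, an $f$-divergence between distributions $P$ and $Q$ is conventionally defined as $D_f(P \| Q) = \mathbb{E}_Q[f(dP/dQ)]$ for a convex $f$ with $f(1) = 0$. The sketch below is a minimal illustration of that standard definition on discrete distributions, not the paper's unlearning loss; the names `GENERATORS`, `f_divergence`, and `gaussian_kl_equal_var` are hypothetical. It also includes the well-known closed form for the KL divergence between two Gaussians with the same isotropic covariance, which is proportional to the squared distance between their means. This is one standard way a squared-error objective can arise from a divergence; whether it matches the paper's derivation is not stated in this summary.

```python
import math

# Generator functions f for D_f(P || Q) = sum_i q_i * f(p_i / q_i).
# Each f is convex on (0, inf) and satisfies f(1) = 0.
GENERATORS = {
    "kl": lambda t: t * math.log(t),
    "reverse_kl": lambda t: -math.log(t),
    "pearson_chi2": lambda t: (t - 1.0) ** 2,
}


def f_divergence(p, q, name):
    """D_f(P || Q) for discrete distributions with strictly positive entries."""
    f = GENERATORS[name]
    return sum(qi * f(pi / qi) for pi, qi in zip(p, q))


def gaussian_kl_equal_var(mu_p, mu_q, sigma):
    """Closed-form KL between N(mu_p, sigma^2 I) and N(mu_q, sigma^2 I).

    With equal isotropic covariance, the KL divergence reduces to the
    squared Euclidean distance between the means, scaled by 1 / (2 sigma^2).
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(mu_p, mu_q))
    return sq_dist / (2.0 * sigma ** 2)


if __name__ == "__main__":
    p = [0.5, 0.5]
    q = [0.25, 0.75]
    for name in GENERATORS:
        print(name, f_divergence(p, q, name))
    print("gaussian kl", gaussian_kl_equal_var([1.0, 0.0], [0.0, 0.0], sigma=1.0))
```

Swapping the entry chosen from `GENERATORS` changes how strongly mismatches between `p` and `q` are penalized, which is the kind of flexibility the abstract attributes to the choice of divergence.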

