LG CVMar 2, 2024

Training Unbiased Diffusion Models From Biased Dataset

Yeongmin Kim, Byeonghu Na, Minsang Park, JoonHo Jang, Dongjun Kim, Wanmo Kang, Il-Chul Moon

arXiv:2403.01189v122.440 citationsh-index: 11Has CodeICLR

Originality Incremental advance

AI Analysis

This addresses bias mitigation in generative models, which is important for improving sample quality and fairness, though it appears incremental as an enhancement to existing reweighting methods.

The paper tackles dataset bias in diffusion models by proposing time-dependent importance reweighting, which outperforms baselines on datasets like CIFAR-10 and CelebA under various bias settings.

With significant advancements in diffusion models, addressing the potential risks of dataset bias becomes increasingly important. Since generated outputs directly suffer from dataset bias, mitigating latent bias becomes a key factor in improving sample quality and proportion. This paper proposes time-dependent importance reweighting to mitigate the bias for the diffusion models. We demonstrate that the time-dependent density ratio becomes more precise than previous approaches, thereby minimizing error propagation in generative learning. While directly applying it to score-matching is intractable, we discover that using the time-dependent density ratio both for reweighting and score correction can lead to a tractable form of the objective function to regenerate the unbiased data density. Furthermore, we theoretically establish a connection with traditional score-matching, and we demonstrate its convergence to an unbiased distribution. The experimental evidence supports the usefulness of the proposed method, which outperforms baselines including time-independent importance reweighting on CIFAR-10, CIFAR-100, FFHQ, and CelebA with various bias settings. Our code is available at https://github.com/alsdudrla10/TIW-DSM.

View on arXiv PDF Code

Similar