Learning Robust Diffusion Models from Imprecise Supervision
This work addresses a common issue in generative AI: training datasets often contain imperfect labels. It offers a route to more robust model training, though the contribution is an incremental improvement within diffusion models.
The paper tackles the problem of training conditional diffusion models under imprecise supervision, such as noisy or ambiguous labels, which degrades generation quality. It proposes DMIS, a framework that improves sample quality and class discrimination across image generation, weakly supervised learning, and noisy dataset condensation.
Conditional diffusion models have recently achieved remarkable success in various generative tasks, but their training typically relies on large-scale datasets that inevitably contain imprecise information in conditional inputs. Such supervision, often stemming from noisy, ambiguous, or incomplete labels, causes condition mismatch and degrades generation quality. To address this challenge, we propose DMIS, a unified framework for training robust Diffusion Models from Imprecise Supervision; this is the first systematic study of the problem within diffusion models. Our framework is derived from likelihood maximization and decomposes the objective into generative and classification components: the generative component models imprecise-label distributions, while the classification component leverages a diffusion classifier to infer class-posterior probabilities, with its efficiency further improved by an optimized timestep sampling strategy. Extensive experiments on diverse forms of imprecise supervision, covering image generation, weakly supervised learning, and noisy dataset condensation, demonstrate that DMIS consistently produces high-quality and class-discriminative samples.
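To make the classification component more concrete, the sketch below illustrates the general diffusion-classifier idea: a class-conditional noise predictor is scored by its denoising error under each candidate label, and those errors are turned into approximate class-posterior probabilities. This is not the DMIS implementation. The function name, the unweighted mean-squared error, the fixed list of timesteps (standing in for the paper's optimized timestep sampling, which is not reproduced here), and the toy noise predictor are all assumptions made for illustration.

```python
import numpy as np

def diffusion_classifier_posterior(x0, eps_model, num_classes, alphas_bar,
                                   timesteps, rng, prior=None):
    """Approximate p(y | x0) from per-class denoising errors (illustrative only)."""
    if prior is None:
        prior = np.full(num_classes, 1.0 / num_classes)  # assumed uniform prior
    errors = np.zeros(num_classes)
    for t in timesteps:
        # Use the same noise draw for every class to reduce variance.
        eps = rng.standard_normal(x0.shape)
        a = alphas_bar[t]
        x_t = np.sqrt(a) * x0 + np.sqrt(1.0 - a) * eps  # forward diffusion
        for y in range(num_classes):
            errors[y] += np.mean((eps - eps_model(x_t, t, y)) ** 2)
    errors /= len(timesteps)
    # Lower denoising error means higher posterior; softmax over negative errors.
    logits = np.log(prior) - errors
    logits -= logits.max()  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()

# Toy setup (hypothetical): each class is a point mass at its mean, so the
# ideal noise predictor has a closed form and no trained network is needed.
means = np.array([[-2.0, -2.0], [2.0, 2.0], [2.0, -2.0]])
alphas_bar = np.linspace(0.99, 0.01, 10)  # assumed noise schedule

def toy_eps_model(x_t, t, y):
    a = alphas_bar[t]
    return (x_t - np.sqrt(a) * means[y]) / np.sqrt(1.0 - a)

rng = np.random.default_rng(0)
probs = diffusion_classifier_posterior(
    means[1], toy_eps_model, num_classes=3,
    alphas_bar=alphas_bar, timesteps=[2, 5, 8], rng=rng)
print(probs)
```

In this toy setup the predictor conditioned on the true class recovers the noise exactly, so that class receives the largest posterior. With a trained network the errors are noisy estimates, and the choice of which timesteps to evaluate governs the trade-off between cost and accuracy, which is the efficiency concern the abstract mentions.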