LG · AI · Feb 2

Sparsely Supervised Diffusion

arXiv:2602.02699v1
Originality: Incremental advance
AI Analysis

The paper addresses a specific issue in generative AI that matters to researchers and practitioners, but the advance is incremental because it builds on existing diffusion model frameworks.

The paper tackles the problem of spatially inconsistent generation in diffusion models by proposing a sparsely supervised learning method with a masking strategy, which achieves competitive FID scores and avoids training instability on small datasets.

Diffusion models have shown remarkable success across a wide range of generative tasks. However, they often suffer from spatially inconsistent generation, arguably due to the inherent locality of their denoising mechanisms. This can yield samples that are locally plausible but globally inconsistent. To mitigate this issue, we propose sparsely supervised learning for diffusion models, a simple yet effective masking strategy that can be implemented with only a few lines of code. Interestingly, the experiments show that it is safe to mask up to 98% of pixels during diffusion model training. Our method delivers competitive FID scores across experiments and, most importantly, avoids training instability on small datasets. Moreover, the masking strategy reduces memorization and promotes the use of essential contextual information during generation.
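
The abstract does not spell out how the mask is applied, so the following is only a minimal sketch of one plausible reading: the training loss is averaged over a random sparse subset of pixels, and the rest are masked out. The function names `sample_keep_mask` and `sparse_mse`, the independent per-pixel sampling, and the use of NumPy arrays as stand-ins for the network output and regression target are all assumptions made for illustration, not the paper's implementation.

```python
import numpy as np


def sample_keep_mask(shape, mask_ratio, rng):
    """Return a boolean array that is True for pixels that stay supervised.

    Assumption: each pixel is kept independently with probability
    1 - mask_ratio. The paper may sample its mask differently.
    """
    if not 0.0 <= mask_ratio < 1.0:
        raise ValueError("mask_ratio must be in [0, 1)")
    return rng.random(shape) >= mask_ratio


def sparse_mse(pred, target, keep):
    """Mean squared error computed over the kept (unmasked) pixels only."""
    keep = keep.astype(pred.dtype)
    sq_err = (pred - target) ** 2 * keep
    # Guard against the rare draw in which no pixel is kept.
    return sq_err.sum() / max(keep.sum(), 1.0)


rng = np.random.default_rng(0)
# Random stand-ins for a batch of regression targets and network outputs.
target = rng.standard_normal((4, 3, 32, 32))
pred = rng.standard_normal((4, 3, 32, 32))

# Mask 98% of pixels, the upper ratio quoted in the abstract.
keep = sample_keep_mask(target.shape, mask_ratio=0.98, rng=rng)
loss = sparse_mse(pred, target, keep)
```

In a real training loop, `pred` would be the denoiser's output and `target` its usual regression target, with a fresh mask drawn at every step. Whether the mask should be shared across channels or vary with the timestep is not stated in the abstract.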
