CV | Feb 24, 2025

Mitigating Hallucinations in Diffusion Models through Adaptive Attention Modulation

arXiv:2502.16872v1 | 11 citations | h-index: 44
Originality: Incremental advance
AI Analysis

This work addresses unreliable image generation in diffusion models, a concern for practitioners in computer vision and AI, though the contribution is an incremental improvement.

The paper tackles hallucinations in diffusion models, which produce unrealistic features, and proposes Adaptive Attention Modulation (AAM) to mitigate them. On the Hands dataset, it reports a 20.8% improvement in FID score and a 12.9 percentage-point (absolute) reduction in the share of hallucinated images.

Diffusion models, while increasingly adept at generating realistic images, are notably hindered by hallucinations -- unrealistic or incorrect features inconsistent with the trained data distribution. In this work, we propose Adaptive Attention Modulation (AAM), a novel approach to mitigate hallucinations by analyzing and modulating the self-attention mechanism in diffusion models. We hypothesize that self-attention during early denoising steps may inadvertently amplify or suppress features, contributing to hallucinations. To counter this, AAM introduces a temperature scaling mechanism within the softmax operation of the self-attention layers, dynamically modulating the attention distribution during inference. Additionally, AAM employs a masked perturbation technique to disrupt early-stage noise that may otherwise propagate into later stages as hallucinations. Extensive experiments demonstrate that AAM effectively reduces hallucinatory artifacts, enhancing both the fidelity and reliability of generated images. For instance, the proposed approach improves the FID score by 20.8% and reduces the percentage of hallucinated images by 12.9% (in absolute terms) on the Hands dataset.
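
The temperature scaling described in the abstract can be illustrated with a minimal sketch of single-head self-attention in which the attention scores are divided by a temperature before the softmax. This is not the authors' implementation: the function names, the fixed `temperature` argument, and the plain-list data layout are assumptions made for illustration, and the abstract does not specify how AAM chooses the temperature adaptively or how its masked perturbation works, so neither is modelled here.

```python
import math


def softmax_with_temperature(scores, temperature=1.0):
    """Softmax over a list of scores after dividing them by a temperature.

    temperature < 1 sharpens the distribution, temperature > 1 flattens it.
    (Hypothetical helper; a fixed temperature is an assumption.)
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values, temperature=1.0):
    """Single-head scaled dot-product attention with a softmax temperature.

    queries, keys, values are lists of float vectors (one per token).
    """
    dim = len(keys[0])
    outputs = []
    for q in queries:
        # Dot-product scores between this query and every key, scaled by sqrt(d).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys
        ]
        weights = softmax_with_temperature(scores, temperature)
        # Weighted sum of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs


if __name__ == "__main__":
    scores = [2.0, 1.0, 0.0]
    for t in (0.5, 1.0, 2.0):
        weights = softmax_with_temperature(scores, t)
        print(t, [round(w, 3) for w in weights])
```

In this sketch, lowering the temperature concentrates attention on the highest-scoring token, while raising it spreads attention more evenly. Applying such a rescaling only during the early denoising steps would be one way to act on the hypothesis stated in the abstract.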

