Mask-Guided Attention Regulation for Anatomically Consistent Counterfactual CXR Synthesis
This work matters to medical imaging researchers and clinicians because it makes counterfactual chest X-ray (CXR) synthesis more reliable and anatomically consistent, which supports localized counterfactual analysis and data augmentation for downstream tasks.
This paper addresses two problems in diffusion-based counterfactual chest X-ray (CXR) synthesis: structural drift and unstable pathology expression. The authors propose an inference-time attention regulation framework that uses organ masks to confine structural interactions and strengthens pathology-token cross-attention. Compared with standard diffusion editing, it yields better anatomical consistency and more precise, controllable pathological edits.
Counterfactual generation for chest X-rays (CXR) aims to simulate plausible pathological changes while preserving patient-specific anatomy. However, diffusion-based editing methods often suffer from structural drift, where stable anatomical semantics propagate globally through attention and distort non-target regions, and unstable pathology expression, since subtle and localized lesions induce weak and noisy conditioning signals. We present an inference-time attention regulation framework for reliable counterfactual CXR synthesis. An anatomy-aware attention regularization module gates self-attention and anatomy-token cross-attention with organ masks, confining structural interactions to anatomical ROIs and reducing unintended distortions. A pathology-guided module enhances pathology-token cross-attention within target lung regions during early denoising and performs lightweight latent corrections driven by an attention-concentration energy, enabling controllable lesion localization and extent. Extensive evaluations on CXR datasets show improved anatomical consistency and more precise, controllable pathological edits compared with standard diffusion editing, supporting localized counterfactual analysis and data augmentation for downstream tasks.
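The two mechanisms described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and the abstract does not give exact formulas. The function names, the "same side of the mask" gating rule, the additive logit boost, and the particular form of the concentration energy are all assumptions chosen for illustration. The latent correction step, which would need gradients of the energy with respect to the latent, is omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax; -inf logits map to exactly zero weight.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def gated_self_attention(q, k, v, organ_mask):
    """Self-attention restricted by an organ mask (assumed gating rule).

    q, k, v: (N, d) arrays over N spatial positions.
    organ_mask: (N,) boolean, True inside the anatomical ROI.
    A query may only attend to keys on the same side of the mask,
    so ROI and non-ROI positions cannot exchange information.
    """
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    same_region = organ_mask[:, None] == organ_mask[None, :]
    scores = np.where(same_region, scores, -np.inf)
    return softmax(scores, axis=-1) @ v

def boosted_cross_attention(q, k_tokens, path_idx, target_mask, boost=2.0):
    """Cross-attention with the pathology token boosted inside the target region.

    q: (N, d) spatial queries; k_tokens: (T, d) text-token keys.
    path_idx: index of the pathology token (hypothetical parameter).
    boost: additive logit bias applied only where target_mask is True.
    Returns attention weights of shape (N, T).
    """
    d = q.shape[-1]
    logits = q @ k_tokens.T / np.sqrt(d)
    logits[:, path_idx] += boost * target_mask.astype(logits.dtype)
    return softmax(logits, axis=-1)

def concentration_energy(attn_path, target_mask, eps=1e-8):
    """Assumed energy: fraction of pathology-token attention outside the target.

    Lower values mean the attention is more concentrated in the target region.
    """
    inside = (attn_path * target_mask).sum()
    return 1.0 - inside / (attn_path.sum() + eps)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    N, d, T = 16, 8, 4                      # toy sizes
    q, k, v = (rng.normal(size=(N, d)) for _ in range(3))
    k_tokens = rng.normal(size=(T, d))
    lung_mask = np.arange(N) < 8            # first half of positions is the "lung"

    out = gated_self_attention(q, k, v, lung_mask)
    attn = boosted_cross_attention(q, k_tokens, 2, lung_mask, boost=3.0)
    print(out.shape, attn.shape, concentration_energy(attn[:, 2], lung_mask))
```

In this sketch, hard gating with `-inf` logits guarantees that positions inside the mask receive no contribution from values outside it, which is one simple way to prevent edits from leaking into non-target anatomy. A soft penalty in place of `-inf` would be an equally plausible reading of "gates" in the abstract.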