CV · Mar 6

DiffInf: Influence-Guided Diffusion for Supervision Alignment in Facial Attribute Learning

arXiv:2603.06399v17.3 · h-index: 3

Predicted impact: top 70% in CV · last 90 days · Originality: Incremental advance

AI Analysis

This work tackles supervision errors in facial attribute datasets, which impair representation learning and degrade downstream prediction for researchers and practitioners working with such data. It offers an incremental improvement over existing noisy-label training and robust optimization methods.

This paper addresses annotation inconsistencies in facial attribute datasets, where continuous traits like age and expression are discretized into categorical labels, leading to mismatches between images and labels. The authors propose DiffInf, a self-influence-guided diffusion framework that identifies and generatively corrects influential training instances to better align visual content with assigned labels, resulting in improved generalization in multi-class facial attribute classification.

Facial attribute classification relies on large-scale annotated datasets in which many traits, such as age and expression, are inherently ambiguous and continuous but are discretized into categorical labels. Annotation inconsistencies arise from subjectivity and visual confounders such as pose, illumination, expression, and demographic variation, creating mismatches between images and assigned labels. These inconsistencies introduce supervision errors that impair representation learning and degrade downstream prediction. We introduce DiffInf, a self-influence-guided diffusion framework for mitigating annotation inconsistencies in facial attribute learning. We first train a baseline classifier and compute sample-wise self-influence scores using a practical first-order approximation to identify training instances that disproportionately destabilize optimization. Instead of discarding these influential samples, we apply targeted generative correction via a latent diffusion autoencoder to better align visual content with assigned labels while preserving identity and realism. To enable differentiable guidance during correction, we train a lightweight predictor of high-influence membership and use it as a surrogate influence regularizer. The edited samples replace the originals, yielding an influence-refined dataset of unchanged size. Across multi-class facial attribute classification, DiffInf consistently improves generalization compared with standard noisy-label training, robust optimization baselines, and influence-based filtering. Our results demonstrate that repairing influential annotation inconsistencies at the image level enhances downstream facial attribute classification without sacrificing distributional coverage.
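The abstract does not give the exact form of the first-order self-influence approximation, so the following is only an illustrative sketch of one common choice. It assumes a TracIn-style score at a single checkpoint, namely the learning rate times the squared norm of a sample's own loss gradient, and a plain linear softmax classifier in place of the paper's baseline network. The names `self_influence` and `flag_high_influence`, the `lr` parameter, and the top-fraction selection rule are hypothetical and not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def self_influence(weights, x, label, lr=1.0):
    """Assumed first-order self-influence of one sample.

    Assumption (not from the paper): score = lr * ||dL/dW||^2 for the
    cross-entropy loss L of a linear softmax classifier with weight
    matrix `weights` (one row per class).
    """
    logits = [sum(w * v for w, v in zip(row, x)) for row in weights]
    probs = softmax(logits)
    # Gradient of cross-entropy w.r.t. the logits: probs - one_hot(label).
    err = [p - (1.0 if k == label else 0.0) for k, p in enumerate(probs)]
    # dL/dW is the outer product of err and x, so its squared Frobenius
    # norm factorizes into ||err||^2 * ||x||^2.
    return lr * sum(e * e for e in err) * sum(v * v for v in x)


def flag_high_influence(scores, fraction=0.1):
    """Return sorted indices of the top `fraction` of samples by score."""
    k = max(1, int(round(fraction * len(scores))))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


if __name__ == "__main__":
    W = [[1.0, 0.0], [0.0, 1.0]]            # toy 2-class linear classifier
    data = [([3.0, 0.0], 0), ([3.0, 0.0], 1), ([0.0, 3.0], 1)]
    scores = [self_influence(W, x, y) for x, y in data]
    # The second sample's label disagrees with the classifier, so it
    # receives the largest score and is the one flagged.
    print(flag_high_influence(scores, fraction=0.34))
```

Under these assumptions, a sample whose assigned label conflicts with what the classifier has learned produces a large gradient and therefore a high score. That is the kind of instance the abstract describes as a candidate for generative correction rather than removal.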
