LG ST ML | Oct 2, 2025

Diffusion Transformers for Imputation: Statistical Efficiency and Uncertainty Quantification

arXiv:2510.02216v1 | h-index: 3
Originality: Incremental advance
AI Analysis

This work addresses a gap in the theoretical understanding of diffusion-based imputation methods, which matters to researchers and practitioners working with incomplete time-series data.

The paper develops theory for diffusion-based generative imputation of time-series data with missing values. It derives statistical sample complexity bounds and constructs tight confidence regions for the missing values, and it finds that imputation efficiency and accuracy depend significantly on the missing patterns.

Imputation methods play a critical role in enhancing the quality of practical time-series data, which often suffer from pervasive missing values. Recently, diffusion-based generative imputation methods have demonstrated remarkable success compared to autoregressive and conventional statistical approaches. Despite their empirical success, the theoretical understanding of how well diffusion-based models capture complex spatial and temporal dependencies between the missing values and observed ones remains limited. Our work addresses this gap by investigating the statistical efficiency of conditional diffusion transformers for imputation and quantifying the uncertainty in missing values. Specifically, we derive statistical sample complexity bounds based on a novel approximation theory for conditional score functions using transformers, and, through this, construct tight confidence regions for missing values. Our findings also reveal that the efficiency and accuracy of imputation are significantly influenced by the missing patterns. Furthermore, we validate these theoretical insights through simulation and propose a mixed-masking training strategy to enhance the imputation performance.

