LG | Aug 1, 2025

Diffusion-Scheduled Denoising Autoencoders for Anomaly Detection in Tabular Data

Timur Sattarov, Marco Schreyer, Damian Borth

arXiv:2508.00758v1 | 9.42 citations | h-index: 12 | KDD

Originality: Incremental advance

AI Analysis

This paper addresses anomaly detection in tabular data for applications where anomalous examples are scarce. It is an incremental advance over existing autoencoder and diffusion methods.

The paper tackles anomaly detection in tabular data by proposing the Diffusion-Scheduled Denoising Autoencoder (DDAE), which integrates diffusion-based noise scheduling and contrastive learning into the encoding process. On 57 ADBench datasets, the authors report improvements of up to 65% in PR-AUC and 16% in ROC-AUC over state-of-the-art autoencoder baselines.

Anomaly detection in tabular data remains challenging due to complex feature interactions and the scarcity of anomalous examples. Denoising autoencoders rely on fixed-magnitude noise, limiting adaptability to diverse data distributions. Diffusion models introduce scheduled noise and iterative denoising, but lack explicit reconstruction mappings. We propose the Diffusion-Scheduled Denoising Autoencoder (DDAE), a framework that integrates diffusion-based noise scheduling and contrastive learning into the encoding process to improve anomaly detection. We evaluated DDAE on 57 datasets from ADBench. Our method outperforms the baselines in semi-supervised settings and achieves competitive results in unsupervised settings, improving PR-AUC by up to 65% (9%) and ROC-AUC by 16% (6%) over state-of-the-art autoencoder (diffusion) model baselines. We observed that higher noise levels benefit unsupervised training, while lower noise with linear scheduling is optimal in semi-supervised settings. These findings underscore the importance of principled noise strategies in tabular anomaly detection.
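To make the idea of diffusion-scheduled noise concrete, here is a minimal NumPy sketch of how a denoising autoencoder's input corruption could follow a linear diffusion schedule instead of a fixed noise magnitude. It is not the authors' implementation. The function names, the schedule endpoints (`1e-4`, `0.02`), the number of steps, and the stand-in `reconstruct` function are all assumptions chosen for illustration. The contrastive-learning component is omitted.

```python
import numpy as np

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced per-step noise variances (endpoint values are illustrative)."""
    return np.linspace(beta_start, beta_end, num_steps)

def add_scheduled_noise(x0, t, alpha_bar, rng):
    """Corrupt rows of x0 at timesteps t:
    x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps."""
    a = alpha_bar[t][:, None]            # one noise level per row
    eps = rng.standard_normal(x0.shape)  # standard Gaussian noise
    return np.sqrt(a) * x0 + np.sqrt(1.0 - a) * eps

def anomaly_score(x, reconstruct):
    """Per-row mean squared reconstruction error; higher means more anomalous."""
    return np.mean((x - reconstruct(x)) ** 2, axis=1)

rng = np.random.default_rng(0)
betas = linear_beta_schedule(num_steps=100)
alpha_bar = np.cumprod(1.0 - betas)      # cumulative signal retention per step

x0 = rng.standard_normal((8, 5))         # toy "tabular" batch: 8 rows, 5 features
t = rng.integers(0, len(betas), size=x0.shape[0])
x_noisy = add_scheduled_noise(x0, t, alpha_bar, rng)

# Hypothetical stand-in for a trained encoder/decoder: predicts the column means.
col_means = x0.mean(axis=0)
reconstruct = lambda x: np.broadcast_to(col_means, x.shape)
scores = anomaly_score(x0, reconstruct)
```

In an actual training loop, a network would take `x_noisy` as input and be trained to reconstruct `x0`. The range from which `t` is sampled would then control how much noise the model sees, which is the kind of knob the abstract's observations about noise levels refer to.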
