LG | Jul 23, 2023

TabADM: Unsupervised Tabular Anomaly Detection with Diffusion Models

arXiv:2307.12336v1 | 6 citations | h-index: 44
Originality: Highly original
AI Analysis

This work addresses anomaly detection in contaminated tabular datasets, a problem relevant to both researchers and practitioners. It is an incremental improvement: a novel method applied to a known bottleneck.

The paper tackles unsupervised anomaly detection in tabular data. It proposes a diffusion-based probabilistic model that learns the density of normal samples, using a rejection scheme to reduce the influence of anomalies, and reports improved detection over baselines on real data.

Tables are an abundant form of data with use cases across all scientific fields. Real-world datasets often contain anomalous samples that can negatively affect downstream analysis. In this work, we only assume access to contaminated data and present a diffusion-based probabilistic model effective for unsupervised anomaly detection. Our model is trained to learn the density of normal samples by utilizing a unique rejection scheme to attenuate the influence of anomalies on the density estimation. At inference, we identify anomalies as samples in low-density regions. We use real data to demonstrate that our method improves detection capabilities over baselines. Furthermore, our method is relatively stable to the dimension of the data and does not require extensive hyperparameter tuning.
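The idea of fitting a density on contaminated data while rejecting the most suspicious samples can be illustrated with a minimal sketch. This is not the paper's implementation: a diagonal Gaussian stands in for the diffusion model, and the function names, the `reject_frac` parameter, and the synthetic data are all assumptions made for illustration. The abstract does not say how the actual rejection scheme works.

```python
import math
import random


def fit_diag_gaussian(rows):
    """Per-feature mean and variance; a simple stand-in for the diffusion model."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    var = [sum((r[j] - mean[j]) ** 2 for r in rows) / n + 1e-6 for j in range(d)]
    return mean, var


def anomaly_score(row, mean, var):
    """Negative log-density under the fitted model; higher means more anomalous."""
    return sum(
        0.5 * math.log(2 * math.pi * v) + (x - m) ** 2 / (2 * v)
        for x, m, v in zip(row, mean, var)
    )


def fit_with_rejection(rows, reject_frac=0.1, rounds=5):
    """Hypothetical rejection loop: refit on the lowest-scoring samples only."""
    mean, var = fit_diag_gaussian(rows)
    n_keep = len(rows) - int(reject_frac * len(rows))
    for _ in range(rounds):
        ranked = sorted(rows, key=lambda r: anomaly_score(r, mean, var))
        # Drop the highest-scoring samples so they cannot distort the density.
        mean, var = fit_diag_gaussian(ranked[:n_keep])
    return mean, var


# Synthetic contaminated table: 200 normal rows followed by 10 anomalous rows.
rng = random.Random(0)
normal = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
anomalies = [[rng.gauss(6, 0.5), rng.gauss(6, 0.5)] for _ in range(10)]
data = normal + anomalies

mean, var = fit_with_rejection(data, reject_frac=0.1)
scores = [anomaly_score(r, mean, var) for r in data]

# Indices of the ten samples lying in the lowest-density regions.
top10 = sorted(range(len(data)), key=lambda i: scores[i], reverse=True)[:10]
```

At inference, samples are ranked by score, mirroring the abstract's description of anomalies as samples in low-density regions. Setting `reject_frac` above the expected contamination rate discards some normal rows as well, which trades a slightly tighter density estimate for robustness.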

