LGAIApr 6

Extending Tabular Denoising Diffusion Probabilistic Models for Time-Series Data Generation

arXiv:2604.0525734.9h-index: 4
Predicted impact top 68% in LG · last 90 daysOriginality Incremental advance
AI Analysis

This work addresses the need for privacy-preserving data augmentation in time-series domains, such as sensor data, by enabling temporally coherent generation, though it is incremental as it builds on existing diffusion models.

The paper tackled the problem of generating synthetic time-series data by extending TabDDPM to handle temporal dependencies, resulting in synthetic sequences that closely resemble real sensor patterns with a macro F1-score of 0.64 and accuracy of 0.71 on the WISDM dataset.

Diffusion models are increasingly being utilised to create synthetic tabular and time series data for privacy-preserving augmentation. Tabular Denoising Diffusion Probabilistic Models (TabDDPM) generate high-quality synthetic data from heterogeneous tabular datasets but assume independence between samples, limiting their applicability to time-series domains where temporal dependencies are critical. To address this, we propose a temporal extension of TabDDPM, introducing sequence awareness through the use of lightweight temporal adapters and context-aware embedding modules. By reformulating sensor data into windowed sequences and explicitly modeling temporal context via timestep embeddings, conditional activity labels, and observed/missing masks, our approach enables the generation of temporally coherent synthetic sequences. Compared to baseline and interpolation techniques, validation using bigram transition matrices and autocorrelation analysis shows enhanced temporal realism, diversity, and coherence. On the WISDM accelerometer dataset, the suggested system produces synthetic time-series that closely resemble real world sensor patterns and achieves comparable classification performance (macro F1-score 0.64, accuracy 0.71). This is especially advantageous for minority class representation and preserving statistical alignment with real distributions. These developments demonstrate that diffusion based models provide effective and adaptable solutions for sequential data synthesis when they are equipped for temporal reasoning. Future work will explore scaling to longer sequences and integrating stronger temporal architectures.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes