LG ML | Nov 20, 2018

T-CGAN: Conditional Generative Adversarial Network for Data Augmentation in Noisy Time Series with Irregular Sampling

Giorgia Ramponi, Pavlos Protopapas, Marco Brambilla, Ryan Janssen

arXiv:1811.08295v214.0118 citations

Originality: Incremental advance

AI Analysis

The paper addresses the problem of limited data in time series analysis, which affects researchers and practitioners working in domains with irregular sampling in particular. The contribution is incremental because it builds on existing CGAN methods.

The paper tackles data augmentation for noisy, irregularly sampled time series by proposing T-CGAN, a conditional generative adversarial network conditioned on timestamps. On synthetic data, classifiers trained on T-CGAN-generated series perform as well as classifiers trained on real data. On real-world datasets, T-CGAN outperforms other augmentation methods such as time slicing and time warping, especially with small training sets and short, noisy series.
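For context on the baselines named above, here is a minimal sketch of time slicing and time warping in plain Python. It shows one common formulation of each, not necessarily the variant the paper evaluated. The function names, the `ratio` and `factor` parameters, and the use of linear interpolation are all assumptions made for illustration. Both functions expect a list of at least two values.

```python
import random


def window_slice(series, ratio=0.9, rng=random):
    """Time slicing (assumed form): return a random contiguous window
    covering roughly `ratio` of the original series."""
    n = len(series)
    width = max(1, int(n * ratio))
    start = rng.randint(0, n - width)  # randint is inclusive on both ends
    return series[start:start + width]


def time_warp(series, factor=2.0, ratio=0.5, rng=random):
    """Time warping (assumed form): pick a random window covering `ratio`
    of the series and stretch it by `factor` via linear interpolation."""
    n = len(series)
    width = max(2, int(n * ratio))
    start = rng.randint(0, n - width)
    window = series[start:start + width]
    new_len = max(2, int(round(width * factor)))
    warped = []
    for i in range(new_len):
        # Map the new index back onto the original window's index range.
        pos = i * (width - 1) / (new_len - 1)
        lo = int(pos)
        hi = min(lo + 1, width - 1)
        frac = pos - lo
        warped.append(window[lo] * (1 - frac) + window[hi] * frac)
    # Splice the stretched window back between the untouched parts.
    return series[:start] + warped + series[start + width:]
```

Neither transform uses the sampling timestamps: both treat the series as an evenly spaced sequence. T-CGAN differs on exactly this point, because it conditions on the timestamps.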

In this paper we propose a data augmentation method for time series with irregular sampling, Time-Conditional Generative Adversarial Network (T-CGAN). Our approach is based on Conditional Generative Adversarial Networks (CGAN), where the generative step is implemented by a deconvolutional NN and the discriminative step by a convolutional NN. Both the generator and the discriminator are conditioned on the sampling timestamps, to learn the hidden relationship between data and timestamps, and consequently to generate new time series. We evaluate our model with synthetic and real-world datasets. For the synthetic data, we compare the performance of a classifier trained on T-CGAN-generated data against the performance of the same classifier trained on the original data. Results show that classifiers trained on T-CGAN-generated data perform the same as classifiers trained on real data, even with very short time series and small training sets. For the real-world datasets, we compare our method with other data augmentation techniques for time series, such as time slicing and time warping, on a classification problem with unbalanced datasets. Results show that our method always outperforms the other approaches, for both regularly sampled and irregularly sampled time series. We achieve particularly good performance in the case of a small training set and short, noisy, irregularly sampled time series.
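The timestamp conditioning described in the abstract can be illustrated with a small sketch of how the inputs to the two networks might be assembled. This is not the authors' implementation. The abstract does not specify the conditioning mechanism, so plain concatenation is an assumption here, and the noisy sine signal and all function names are hypothetical. The networks themselves (deconvolutional generator, convolutional discriminator) are omitted.

```python
import math
import random


def sample_irregular_series(length, rng, noise_std=0.1, t_max=1.0):
    """Draw sorted random timestamps in [0, t_max] and observe a noisy
    sine wave at those times (a toy stand-in for real data)."""
    timestamps = sorted(rng.uniform(0.0, t_max) for _ in range(length))
    values = [math.sin(2 * math.pi * t) + rng.gauss(0.0, noise_std)
              for t in timestamps]
    return timestamps, values


def generator_input(noise, timestamps):
    """Assumed conditioning by concatenation: the generator receives the
    latent noise z together with the timestamps t."""
    return list(noise) + list(timestamps)


def discriminator_input(values, timestamps):
    """The discriminator receives the series x (real or generated)
    together with the same timestamps t."""
    return list(values) + list(timestamps)
```

Because both networks see the timestamps, the discriminator can reject a series whose values are implausible at the times they were supposedly observed. That pressure pushes the generator to learn the relationship between values and sampling times.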

