LG · Aug 15, 2024

A Systematic Evaluation of Generated Time Series and Their Effects in Self-Supervised Pretraining

arXiv:2408.07869v1 · 1 citation · h-index: 26
Originality: Incremental advance
AI Analysis

The paper addresses data scarcity in time series analysis. The advance is incremental because it builds on existing time series generation methods.

The study examines why self-supervised pretrained models (PTMs) for time series underperform simple supervised models, and attributes the gap tentatively to data scarcity. It tests six time series generation methods and finds that replacing the real-data pretraining set with a larger volume of generated samples improves classification performance.

Self-supervised Pretrained Models (PTMs) have demonstrated remarkable performance in computer vision and natural language processing tasks. These successes have prompted researchers to design PTMs for time series data. In our experiments, most self-supervised time series PTMs were surpassed by simple supervised models. We hypothesize this undesired phenomenon may be caused by data scarcity. In response, we test six time series generation methods, use the generated data in pretraining in lieu of the real data, and examine the effects on classification performance. Our results indicate that replacing a real-data pretraining set with a greater volume of only generated samples produces noticeable improvement.

