Building a Multivariate Time Series Benchmarking Datasets Inspired by Natural Language Processing (NLP)
This work addresses the limited benchmarking resources available to researchers and practitioners in time series modeling, though its contribution is incremental: it adapts existing NLP strategies to a new domain.
The paper tackles the lack of high-quality benchmark datasets for time series analysis by proposing an approach inspired by NLP benchmark creation methods. The result is a comprehensive dataset intended to support multi-task learning strategies that improve model performance.
Time series analysis has become increasingly important in various domains, and developing effective models relies heavily on high-quality benchmark datasets. Inspired by the success of Natural Language Processing (NLP) benchmark datasets in advancing pre-trained models, we propose a new approach to creating a comprehensive benchmark dataset for time series analysis. This paper explores the methodologies used in NLP benchmark dataset creation and adapts them to the unique challenges of time series data. We discuss the process of curating diverse, representative, and challenging time series datasets, highlighting the importance of domain relevance and data complexity. Additionally, we investigate multi-task learning strategies that leverage the benchmark dataset to enhance the performance of time series models. This research contributes to the broader goal of advancing the state of the art in time series modeling by adopting successful strategies from the NLP domain.