AI | Dec 23, 2024

On the Feasibility of Vision-Language Models for Time-Series Classification

Vinay Prithyani, Mohsin Mohammed, Richa Gadgil, Ricardo Buitrago, Vinija Jain, Aman Chadha

arXiv:2412.17304v2 | 11.66 citations | h-index: 13 | Has Code

Originality: Incremental advance

AI Analysis

The paper addresses time-series classification for researchers and practitioners, but the advance is incremental because it builds on existing VLM capabilities.

The paper tackles time-series classification by giving Vision-Language Models (VLMs) image renderings of the data alongside the numerical values, achieving competitive results after two or fewer epochs of fine-tuning on both univariate and multivariate data.

We build upon time-series classification by leveraging the capabilities of Vision-Language Models (VLMs). We find that VLMs produce competitive results after two or fewer epochs of fine-tuning. We develop a novel approach that incorporates graphical data representations as images in conjunction with numerical data. This approach is rooted in the hypothesis that graphical representations can provide additional contextual information that numerical data alone may not capture. Additionally, providing a graphical representation can circumvent issues such as limited context length faced by LLMs. To further advance this work, we implemented a scalable end-to-end pipeline for training on different scenarios, allowing us to isolate the most effective strategies for transferring learning capabilities from LLMs to Time Series Classification (TSC) tasks. Our approach works with univariate and multivariate time-series data. In addition, we conduct extensive and practical experiments to show how this approach works for time-series classification and generative labels.
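The idea of pairing a plot with numerical data can be illustrated with a minimal sketch. This is not the authors' pipeline, whose details are not given in the abstract; it only shows one plausible way to turn a univariate series into an image plus a short numeric prompt. The function names (`rasterize`, `to_png`, `build_prompt`), the image size, the prompt wording, and the candidate labels are all hypothetical, and the hand-rolled PNG encoder stands in for whatever plotting library a real pipeline would use.

```python
import math
import struct
import zlib


def rasterize(series, width=64, height=32):
    """Draw a univariate series as a white-on-black line plot (pixel rows of 0/255)."""
    lo, hi = min(series), max(series)
    span = (hi - lo) or 1.0  # avoid division by zero for a flat series
    grid = [[0] * width for _ in range(height)]
    prev_row = None
    for x in range(width):
        # Map the pixel column to a sample index (nearest-sample resampling).
        i = round(x * (len(series) - 1) / max(width - 1, 1))
        row = (height - 1) - round((series[i] - lo) / span * (height - 1))
        # Fill the vertical gap to the previous column so the line stays connected.
        start, stop = (row, row) if prev_row is None else sorted((prev_row, row))
        for r in range(start, stop + 1):
            grid[r][x] = 255
        prev_row = row
    return grid


def to_png(grid):
    """Encode an 8-bit grayscale pixel grid as PNG bytes."""
    height, width = len(grid), len(grid[0])

    def chunk(tag, data):
        body = tag + data
        return struct.pack(">I", len(data)) + body + struct.pack(">I", zlib.crc32(body))

    # IHDR: width, height, bit depth 8, color type 0 (grayscale), defaults otherwise.
    header = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    # Each scanline is prefixed with filter type 0 (no filtering).
    raw = b"".join(b"\x00" + bytes(row) for row in grid)
    return (b"\x89PNG\r\n\x1a\n" + chunk(b"IHDR", header)
            + chunk(b"IDAT", zlib.compress(raw)) + chunk(b"IEND", b""))


def build_prompt(series, labels, max_values=16):
    """Pair the plot with a short numeric excerpt and the candidate labels."""
    step = max(1, len(series) // max_values)
    excerpt = ", ".join(f"{v:.2f}" for v in series[::step][:max_values])
    return (f"The attached image plots a time series of {len(series)} points. "
            f"Downsampled values: {excerpt}. "
            f"Classify it as one of: {', '.join(labels)}.")


# Hypothetical example: a sine wave and made-up class labels.
series = [math.sin(2 * math.pi * t / 50) for t in range(200)]
png_bytes = to_png(rasterize(series))
prompt = build_prompt(series, ["periodic", "trend", "noise"])
```

The image has a fixed size regardless of series length, while the numeric excerpt is capped at `max_values` entries, which reflects the abstract's point that a graphical representation can sidestep context-length limits. The pair `(png_bytes, prompt)` would then be passed to whichever VLM is being fine-tuned.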
