LG | AI | Nov 9, 2024

A Picture is Worth A Thousand Numbers: Enabling LLMs Reason about Time Series via Visualization

arXiv:2411.06018v2 | 35 citations | h-index: 21 | NAACL
Originality: Incremental advance
AI Analysis

This work addresses the challenge of enabling LLMs to reason about time-series data. The advance is incremental: it builds on existing multimodal LLM capabilities and adds a novel prompting approach.

The authors tackled the problem of large language models (LLMs) underperforming in time-series reasoning (TsR) by proposing VL-Time, a prompt-based solution using visualization-modeled data, which achieved about 140% average performance improvement and 99% average token cost reduction.

Large language models (LLMs), with demonstrated reasoning abilities across multiple domains, are largely underexplored for time-series reasoning (TsR), which is ubiquitous in the real world. In this work, we propose TimerBed, the first comprehensive testbed for evaluating LLMs' TsR performance. Specifically, TimerBed includes stratified reasoning patterns with real-world tasks, comprehensive combinations of LLMs and reasoning strategies, and various supervised models as comparison anchors. We perform extensive experiments with TimerBed, test multiple current beliefs, and verify the initial failures of LLMs in TsR, evidenced by the ineffectiveness of zero-shot (ZST) reasoning and the performance degradation of few-shot in-context learning (ICL). Further, we identify one possible root cause: the numerical modeling of data. To address this, we propose a prompt-based solution, VL-Time, which uses visualization-modeled data and language-guided reasoning. Experimental results demonstrate that VL-Time enables multimodal LLMs to be non-trivial ZST and powerful ICL reasoners for time series, achieving about 140% average performance improvement and 99% average token cost reduction.
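To make the contrast between numerical and visualization-based modeling concrete, here is a minimal sketch that is not the authors' implementation. It serializes a series as text (the numeric route) and draws the same series as a simple SVG line plot (the visual route). The function names `series_to_text` and `series_to_svg`, the plot dimensions, and the choice of SVG are all assumptions made for illustration; in practice the plot would likely be rasterized before being passed to a multimodal LLM, and how an image is charged in tokens depends on the model.

```python
import math


def series_to_text(values, precision=3):
    """Numeric modeling (hypothetical helper): put every sample into the prompt as text."""
    return ", ".join(f"{v:.{precision}f}" for v in values)


def series_to_svg(values, width=400, height=200, pad=10):
    """Visualization modeling (hypothetical helper): draw the series as one line plot."""
    if len(values) < 2:
        raise ValueError("need at least two samples to draw a line")
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for a flat series
    x_step = (width - 2 * pad) / (len(values) - 1)
    points = []
    for i, v in enumerate(values):
        x = pad + i * x_step
        # SVG's y axis grows downward, so flip it to put larger values higher up.
        y = height - pad - (v - lo) / span * (height - 2 * pad)
        points.append(f"{x:.1f},{y:.1f}")
    joined = " ".join(points)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
        f'<polyline fill="none" stroke="black" points="{joined}"/>'
        "</svg>"
    )


# Example: a 1000-sample sine wave with a period of 50 samples.
series = [math.sin(2 * math.pi * t / 50) for t in range(1000)]
numeric_prompt = series_to_text(series)  # text length grows with the series length
svg_plot = series_to_svg(series)         # canvas stays 400x200 however long the series is
```

The numeric string grows linearly with the number of samples, whereas the rendered plot keeps a fixed canvas size. That is one plausible reading of why the abstract reports a large token cost reduction, though the exact accounting is the paper's and is not reproduced here.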

