Can LLMs Serve As Time Series Anomaly Detectors?
This paper addresses time series anomaly detection, a problem relevant to many real-world domains, but its contribution is incremental: it builds on existing LLM capabilities, adding new prompting and fine-tuning methods.
The paper investigates whether large language models (LLMs) such as GPT-4 and LLaMA3 can detect and explain anomalies in time series. It finds that direct use fails, that GPT-4 achieves competitive results when given suitable prompt strategies, and that instruction fine-tuning on a synthesized dataset improves LLaMA3's performance.
An emerging topic in large language models (LLMs) is their application to time series forecasting, which characterizes the mainstream, pattern-following behavior of time series. A related but rarely explored and more challenging question is whether LLMs can detect and explain time series anomalies, a critical task across various real-world applications. In this paper, we investigate the capabilities of LLMs, specifically GPT-4 and LLaMA3, in detecting and explaining anomalies in time series. Our studies reveal that: 1) LLMs cannot be directly used for time series anomaly detection. 2) By designing prompt strategies such as in-context learning and chain-of-thought prompting, GPT-4 can detect time series anomalies with results competitive with baseline methods. 3) We propose a synthesized dataset of automatically generated time series anomalies with corresponding explanations. By applying instruction fine-tuning on this dataset, LLaMA3 demonstrates improved performance in time series anomaly detection tasks. In summary, our exploration shows the promising potential of LLMs as time series anomaly detectors.
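To make the third finding concrete, the following is a minimal sketch of how a synthetic anomaly could be paired with a textual explanation and packaged as an instruction-tuning record. It is not the authors' generator: the sine-plus-noise base signal, the single point-spike anomaly type, the function names (`make_series`, `inject_spike`, `build_record`), and the record fields are all assumptions chosen for illustration.

```python
import math
import random


def make_series(length=50, period=10, noise=0.05, seed=0):
    """Sine wave with small Gaussian noise (an assumed 'normal' pattern)."""
    rng = random.Random(seed)
    return [
        math.sin(2 * math.pi * t / period) + rng.gauss(0, noise)
        for t in range(length)
    ]


def inject_spike(series, index, magnitude=5.0):
    """Return a copy with a point anomaly at `index`, plus a text explanation."""
    out = list(series)
    out[index] += magnitude
    explanation = (
        f"The value at index {index} is about {magnitude:.1f} above the "
        "level expected from the periodic pattern, so it is a point anomaly."
    )
    return out, explanation


def build_record(series, index, explanation):
    """Package one example as a dict (hypothetical field names)."""
    values = ", ".join(f"{v:.2f}" for v in series)
    return {
        "instruction": (
            "Identify anomalous indices in the time series "
            "and explain your reasoning step by step."
        ),
        "input": values,
        "output": f"Anomalous indices: [{index}]. {explanation}",
    }


series = make_series()
anomalous, why = inject_spike(series, index=25)
record = build_record(anomalous, 25, why)
```

The same record layout could also serve as an in-context example placed ahead of a new query series, which is one plausible way to combine the prompting and fine-tuning ideas described above.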