LG | Mar 23
CoRA: Boosting Time Series Foundation Models for Multivariate Forecasting through Correlation-aware Adapter
Hanyin Cheng, Xingjian Wu, Yang Shu et al.
Most existing Time Series Foundation Models (TSFMs) use channel-independent modeling and focus on capturing and generalizing temporal dependencies, while neglecting the correlations among channels or overlooking the different aspects of correlations. However, these correlations play a vital role in multivariate time series forecasting. To address this, we propose the CoRrelation-aware Adapter (CoRA), a lightweight plug-and-play method that requires only fine-tuning with TSFMs and is able to capture different types of correlations, so as to improve forecast performance. Specifically, to reduce complexity, we decompose the correlation matrix into low-rank Time-Varying and Time-Invariant components. For the Time-Varying component, we further design learnable polynomials to learn dynamic correlations by capturing trends or periodic patterns. To learn positive and negative correlations that appear only among some channels, we introduce a novel dual contrastive learning method that identifies correlations through projection layers, regulated by a Heterogeneous-Partial contrastive loss during training, without introducing additional complexity in the inference stage. Extensive experiments on 10 real-world datasets demonstrate that CoRA can improve TSFMs in multivariate forecasting performance.
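The low-rank decomposition described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the exact parameterisation is not given here, so the form C(t) = A Bᵀ + Σₖ tᵏ Pₖ Qₖᵀ, the function names `outer_lowrank` and `correlation_at`, and the plain monomial basis are all assumptions made for illustration. The factors would be learned in practice; here they are fixed numbers.

```python
def outer_lowrank(left, right):
    """Return left @ right^T for two N x r factor matrices given as lists of rows."""
    return [[sum(l * r for l, r in zip(lrow, rrow)) for rrow in right]
            for lrow in left]


def correlation_at(t, invariant, varying):
    """Assumed form: C(t) = A B^T + sum_k t**k * P_k Q_k^T.

    invariant: pair (A, B) of N x r factors (time-invariant part).
    varying:   list of pairs (P_k, Q_k), one per polynomial degree k = 1..K.
    """
    a, b = invariant
    corr = outer_lowrank(a, b)
    for degree, (p, q) in enumerate(varying, start=1):
        term = outer_lowrank(p, q)
        for i, row in enumerate(term):
            for j, value in enumerate(row):
                corr[i][j] += (t ** degree) * value
    return corr


# Three channels, rank-1 factors, one linear-in-time term (hypothetical values).
A = [[1.0], [2.0], [0.0]]
B = [[1.0], [1.0], [1.0]]
P1 = [[1.0], [0.0], [0.0]]
Q1 = [[0.0], [1.0], [0.0]]

static = correlation_at(0.0, (A, B), [(P1, Q1)])
later = correlation_at(2.0, (A, B), [(P1, Q1)])
```

Each factor pair stores 2·N·r numbers instead of the N² entries of a full matrix, which is where the complexity reduction comes from when r is much smaller than N.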
LG | Feb 15, 2025 | Code
A Comprehensive Survey of Deep Learning for Multivariate Time Series Forecasting: A Channel Strategy Perspective
Xiangfei Qiu, Hanyin Cheng, Xingjian Wu et al.
Multivariate Time Series Forecasting (MTSF) plays a crucial role across diverse fields, ranging from economics and energy to traffic. In recent years, deep learning has demonstrated outstanding performance in MTSF tasks. In MTSF, modeling the correlations among different channels is critical, as leveraging information from other related channels can significantly improve the prediction accuracy of a specific channel. This study systematically reviews the channel modeling strategies for time series and proposes a taxonomy organized into three hierarchical levels: the strategy perspective, the mechanism perspective, and the characteristic perspective. On this basis, we provide a structured analysis of these methods and conduct an in-depth examination of the advantages and limitations of different channel strategies. Finally, we summarize and discuss some future research directions to provide useful research guidance. Moreover, we maintain an up-to-date GitHub repository (https://github.com/decisionintelligence/CS4TS) which includes all the papers discussed in the survey.
LG | Nov 27, 2024 | Code
MM-Path: Multi-modal, Multi-granularity Path Representation Learning -- Extended Version
Ronghui Xu, Hanyin Cheng, Chenjuan Guo et al.
Developing effective path representations has become increasingly essential across various fields within intelligent transportation. Although pre-trained path representation learning models have shown improved performance, they predominantly focus on the topological structures from single modality data, i.e., road networks, overlooking the geometric and contextual features associated with path-related images, e.g., remote sensing images. Similar to human understanding, integrating information from multiple modalities can provide a more comprehensive view, enhancing both representation accuracy and generalization. However, variations in information granularity impede the semantic alignment of road network-based paths (road paths) and image-based paths (image paths), while the heterogeneity of multi-modal data poses substantial challenges for effective fusion and utilization. In this paper, we propose a novel Multi-modal, Multi-granularity Path Representation Learning Framework (MM-Path), which can learn a generic path representation by integrating modalities from both road paths and image paths. To enhance the alignment of multi-modal data, we develop a multi-granularity alignment strategy that systematically associates nodes, road sub-paths, and road paths with their corresponding image patches, ensuring the synchronization of both detailed local information and broader global contexts. To address the heterogeneity of multi-modal data effectively, we introduce a graph-based cross-modal residual fusion component designed to comprehensively fuse information across different modalities and granularities. Finally, we conduct extensive experiments on two large-scale real-world datasets under two downstream tasks, validating the effectiveness of the proposed MM-Path. The code is available at: https://github.com/decisionintelligence/MM-Path.
LG | Oct 16, 2025 | Code
Enhancing Time Series Forecasting through Selective Representation Spaces: A Patch Perspective
Xingjian Wu, Xiangfei Qiu, Hanyin Cheng et al.
Time Series Forecasting has made significant progress with the help of the patching technique, which partitions time series into multiple patches to effectively retain contextual semantic information in a representation space beneficial for modeling long-term dependencies. However, conventional patching partitions a time series into adjacent patches, which causes a fixed representation space, thus resulting in insufficiently expressive representations. In this paper, we pioneer the exploration of constructing a selective representation space to flexibly include the most informative patches for forecasting. Specifically, we propose the Selective Representation Space (SRS) module, which utilizes the learnable Selective Patching and Dynamic Reassembly techniques to adaptively select and shuffle the patches from the contextual time series, aiming at fully exploiting the information of contextual time series to enhance the forecasting performance of patch-based models. To demonstrate the effectiveness of the SRS module, we propose a simple yet effective SRSNet consisting of SRS and an MLP head, which achieves state-of-the-art performance on real-world datasets from multiple domains. Furthermore, as a novel plug-and-play module, SRS can also enhance the performance of existing patch-based models. The resources are available at https://github.com/decisionintelligence/SRSNet.
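The contrast between conventional adjacent patching and a selective choice of patches can be sketched as follows. This is only an illustration under stated assumptions: SRS uses learnable selection and reassembly, whereas this sketch substitutes a fixed stand-in score (patch variance), and the names `adjacent_patches`, `selective_patches`, and `variance` are hypothetical.

```python
def adjacent_patches(series, patch_len):
    """Conventional patching: non-overlapping, adjacent windows."""
    return [series[i:i + patch_len]
            for i in range(0, len(series) - patch_len + 1, patch_len)]


def variance(patch):
    """Stand-in informativeness score (assumption; SRS learns its scorer)."""
    mean = sum(patch) / len(patch)
    return sum((x - mean) ** 2 for x in patch) / len(patch)


def selective_patches(series, patch_len, num_patches, score=variance):
    """Score every stride-1 candidate window and keep the top num_patches.

    Returns (start_index, patch) pairs ordered by descending score.
    """
    candidates = [series[i:i + patch_len]
                  for i in range(len(series) - patch_len + 1)]
    ranked = sorted(range(len(candidates)),
                    key=lambda i: score(candidates[i]), reverse=True)
    return [(i, candidates[i]) for i in ranked[:num_patches]]


series = [0.0, 0.0, 0.0, 0.0, 5.0, 0.0, 0.0, 0.0]
fixed = adjacent_patches(series, patch_len=2)
chosen = selective_patches(series, patch_len=2, num_patches=2)
```

With adjacent patching the set of windows is determined by the patch length alone, while the selective variant can place both of its windows around the spike at index 4.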
LG | Oct 15, 2024
TSFM-Bench: A Comprehensive and Unified Benchmark of Foundation Models for Time Series Forecasting
Zhe Li, Xiangfei Qiu, Peng Chen et al.
Time Series Forecasting (TSF) is a key functionality in numerous fields, such as financial investment, weather services, and energy management. Although increasingly capable TSF methods are emerging, many of them require domain-specific data collection and model training and do not generalize well when applied in other domains. Time Series Foundation Models (TSFMs) that are pre-trained on massive heterogeneous time series data aim to overcome these limitations. The prospects for generalizability have spurred the development of a new generation of TSFMs. This study proposes a benchmark, TSFM-Bench, to facilitate comprehensive and unified evaluation of TSFMs. TSFM-Bench covers a wide range of TSFMs, including those based on large language models and those pre-trained on time series data. TSFM-Bench supports multiple forecasting scenarios, including zero-shot, few-shot, and full-shot, enabling assessment across the full range of adaptation strategies. TSFM-Bench also provides standardized experimental protocols for critical evaluation processes such as dataset splitting, loading, normalization, and few-shot sampling, facilitating consistency and fairness. We report on an extensive evaluation of TSFMs across a diverse range of datasets spanning multiple domains and exhibiting varied statistical characteristics. Specifically, we identify pros and cons and inherent limitations of existing TSFMs, and we propose potential directions for new model designs.
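To make the protocol steps named in this abstract concrete (splitting, normalization, few-shot sampling), here is a generic sketch. It does not reproduce TSFM-Bench: the 7:1:2 chronological split, z-score scaling fitted on the training part only, and taking the most recent fraction of the training data as the few-shot sample are all assumptions, and every function name is hypothetical.

```python
def chronological_split(series, train_ratio=0.7, val_ratio=0.1):
    """Split in time order so that no future value leaks into training."""
    n = len(series)
    train_end = round(n * train_ratio)
    val_end = train_end + round(n * val_ratio)
    return series[:train_end], series[train_end:val_end], series[val_end:]


def fit_scaler(train):
    """Mean and standard deviation estimated from the training part only."""
    mean = sum(train) / len(train)
    std = (sum((x - mean) ** 2 for x in train) / len(train)) ** 0.5
    return mean, (std if std > 0 else 1.0)


def transform(values, scaler):
    mean, std = scaler
    return [(x - mean) / std for x in values]


def few_shot(train, fraction=0.1):
    """Keep the most recent fraction of the training data (assumed policy)."""
    keep = max(1, round(len(train) * fraction))
    return train[-keep:]


series = [float(i) for i in range(20)]
train, val, test = chronological_split(series)
scaler = fit_scaler(train)
train_scaled = transform(train, scaler)
test_scaled = transform(test, scaler)
sample = few_shot(train, fraction=0.1)
```

Fixing these steps once and applying them to every model is what makes results comparable; the specific ratios above are placeholders.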
LG | Sep 18, 2025
DAG: A Dual Causal Network for Time Series Forecasting with Exogenous Variables
Xiangfei Qiu, Yuhan Zhu, Zhengyu Li et al.
Time series forecasting is crucial in various fields such as economics, traffic, and AIOps. However, in real-world applications, focusing solely on the endogenous variables (i.e., target variables) is often insufficient to ensure accurate predictions. Considering exogenous variables (i.e., covariates) provides additional predictive information, thereby improving forecasting accuracy. However, existing methods for time series forecasting with exogenous variables (TSF-X) have the following shortcomings: 1) they do not leverage future exogenous variables, and 2) they fail to account for the causal relationships between endogenous and exogenous variables. As a result, their performance is suboptimal. In this study, to better leverage exogenous variables, especially future exogenous variables, we propose a general framework, DAG, which utilizes a dual causal network along both the temporal and channel dimensions for time series forecasting with exogenous variables. Specifically, we first introduce the Temporal Causal Module, which includes a causal discovery module to capture how historical exogenous variables affect future exogenous variables. Following this, we construct a causal injection module that incorporates the discovered causal relationships into the process of forecasting future endogenous variables based on historical endogenous variables. Next, we propose the Channel Causal Module, which follows a similar design principle. It features a causal discovery module that models how historical exogenous variables influence historical endogenous variables, and a causal injection module that incorporates the discovered relationships to enhance the prediction of future endogenous variables based on future exogenous variables.
DB | Mar 6, 2025
RCRank: Multimodal Ranking of Root Causes of Slow Queries in Cloud Database Systems
Biao Ouyang, Yingying Zhang, Hanyin Cheng et al.
With the continued migration of storage to cloud database systems, the impact of slow queries in such systems on services and user experience is increasing. Root-cause diagnosis plays an indispensable role in facilitating slow-query detection and revision. This paper proposes a method capable of both identifying possible root cause types for slow queries and ranking these according to their potential for accelerating slow queries. This enables prioritizing root causes with the highest impact, in turn improving slow-query revision effectiveness. To enable more accurate and detailed diagnoses, we propose the multimodal Ranking for the Root Causes of slow queries (RCRank) framework, which formulates root cause analysis as a multimodal machine learning problem and leverages multimodal information from query statements, execution plans, execution logs, and key performance indicators. To obtain expressive embeddings from its heterogeneous multimodal input, RCRank integrates self-supervised pre-training that enhances cross-modal alignment and task relevance. Next, the framework integrates root-cause-adaptive cross Transformers that enable adaptive fusion of multimodal features with varying characteristics. Finally, the framework offers a unified model that features an impact-aware training objective for identifying and ranking root causes. We report on experiments on real and synthetic datasets, finding that RCRank is capable of consistently outperforming the state-of-the-art methods at root cause identification and ranking according to a range of metrics.
LG | Oct 27, 2025
DBLoss: Decomposition-based Loss Function for Time Series Forecasting
Xiangfei Qiu, Xingjian Wu, Hanyin Cheng et al.
Time series forecasting holds significant value in various domains such as economics, traffic, energy, and AIOps, as accurate predictions facilitate informed decision-making. However, the existing Mean Squared Error (MSE) loss function sometimes fails to accurately capture the seasonality or trend within the forecasting horizon, even when decomposition modules are used in the forward propagation to model the trend and seasonality separately. To address these challenges, we propose a simple yet effective Decomposition-Based Loss function called DBLoss. This method uses exponential moving averages to decompose the time series into seasonal and trend components within the forecasting horizon, and then calculates the loss for each of these components separately, followed by weighting them. As a general loss function, DBLoss can be combined with any deep learning forecasting model. Extensive experiments demonstrate that DBLoss significantly improves the performance of state-of-the-art models across diverse real-world datasets and provides a new perspective on the design of time series loss functions.
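The idea of a decomposition-based loss can be sketched in a few lines. This is a minimal illustration, not the published DBLoss: the smoothing factor `alpha`, the weight `beta`, the use of MSE for both components, and the function names are assumptions.

```python
def ema(series, alpha):
    """Exponential moving average; the first value seeds the average."""
    smoothed = [series[0]]
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


def decompose(series, alpha=0.3):
    """Split a horizon into an EMA trend and the residual seasonal part."""
    trend = ema(series, alpha)
    seasonal = [x - t for x, t in zip(series, trend)]
    return trend, seasonal


def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def decomposition_loss(pred, target, alpha=0.3, beta=0.5):
    """Weighted sum of the seasonal and trend losses over the horizon."""
    pred_trend, pred_seasonal = decompose(pred, alpha)
    true_trend, true_seasonal = decompose(target, alpha)
    return (beta * mse(pred_seasonal, true_seasonal)
            + (1 - beta) * mse(pred_trend, true_trend))


target = [1.0, 2.0, 3.0, 2.0, 1.0, 2.0]
pred = [1.0, 2.5, 2.5, 2.0, 1.5, 2.0]
loss = decomposition_loss(pred, target)
```

Because the decomposition is applied to the prediction and the ground truth alike, the loss needs nothing from the model beyond its output, which is why it can be attached to any forecasting model.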
LG | Sep 26, 2025
Unlocking the Power of Mixture-of-Experts for Task-Aware Time Series Analytics
Xingjian Wu, Zhengyu Li, Hanyin Cheng et al.
Time Series Analysis is widely used in various real-world applications such as weather forecasting, financial fraud detection, imputation for missing data in IoT systems, and classification for action recognition. Mixture-of-Experts (MoE), a powerful architecture that has proven effective in NLP, still falls short in adapting to versatile tasks in time series analytics due to its task-agnostic router and its inability to model channel correlations. In this study, we propose a novel, general MoE-based time series framework called PatchMoE to support the intricate "knowledge" utilization for distinct tasks, making it task-aware. Based on the observation that hierarchical representations often vary across tasks, e.g., forecasting vs. classification, we propose a Recurrent Noisy Gating to utilize the hierarchical information in routing, thus obtaining task-specific capability. The routing strategy operates on time series tokens in both temporal and channel dimensions, and is encouraged by a meticulously designed Temporal & Channel Load Balancing Loss to model the intricate temporal and channel correlations. Comprehensive experiments on five downstream tasks demonstrate the state-of-the-art performance of PatchMoE.
LG | Dec 16, 2025
FLAME: Flow Enhanced Legendre Memory Models for General Time Series Forecasting
Xingjian Wu, Hanyin Cheng, Xiangfei Qiu et al.
In this work, we introduce FLAME, a family of extremely lightweight and capable Time Series Foundation Models, which support both deterministic and probabilistic forecasting via generative probabilistic modeling, thus ensuring both efficiency and robustness. FLAME utilizes the Legendre Memory for strong generalization capabilities. By adapting variants of Legendre Memory, i.e., translated Legendre (LegT) and scaled Legendre (LegS), in the Encoding and Decoding phases, FLAME can effectively capture the inherent inductive bias within data and make efficient long-range inferences. To enhance the accuracy of probabilistic forecasting while remaining efficient, FLAME adopts a Normalization Flow-based forecasting head, which can model arbitrarily intricate distributions over the forecasting horizon in a generative manner. Comprehensive experiments on well-recognized benchmarks, including TSFM-Bench and ProbTS, demonstrate the consistent state-of-the-art zero-shot performance of FLAME on both deterministic and probabilistic forecasting tasks.
LG | Oct 15, 2025
STAR: Boosting Time Series Foundation Models for Anomaly Detection through State-aware Adapter
Hanyin Cheng, Ruitong Zhang, Yuning Lu et al.
While Time Series Foundation Models (TSFMs) have demonstrated remarkable success in Multivariate Time Series Anomaly Detection (MTSAD), in real-world industrial scenarios many time series comprise not only numerical variables such as temperature and flow, but also numerous discrete state variables that describe the system status, such as valve on/off or day of the week. Existing TSFMs often overlook the distinct categorical nature of state variables and their critical role as conditions, typically treating them uniformly with numerical variables. This inappropriate modeling approach prevents the model from fully leveraging state information and even leads to a significant degradation in detection performance after state variables are integrated. To address this critical limitation, this paper proposes a novel STate-aware AdapteR (STAR). STAR is a plug-and-play module designed to enhance the capability of TSFMs in modeling and leveraging state variables during the fine-tuning stage. Specifically, STAR comprises three core components: (1) We design an Identity-guided State Encoder, which effectively captures the complex categorical semantics of state variables through a learnable State Memory. (2) We propose a Conditional Bottleneck Adapter, which dynamically generates low-rank adaptation parameters conditioned on the current state, thereby flexibly injecting the influence of state variables into the backbone model. (3) We also introduce a Numeral-State Matching module to more effectively detect anomalies inherent to the state variables themselves. Extensive experiments conducted on real-world datasets demonstrate that STAR can improve the performance of existing TSFMs on MTSAD.
LG | Sep 28, 2025
Multi-Scale Spatial-Temporal Hypergraph Network with Lead-Lag Structures for Stock Time Series Forecasting
Xiangfei Qiu, Liu Yang, Hanyin Cheng et al.
Time series forecasting occurs in a range of financial applications providing essential decision-making support to investors, regulatory institutions, and analysts. Unlike multivariate time series from other domains, stock time series exhibit industry correlation. Exploiting this kind of correlation can improve forecasting accuracy. However, existing methods based on hypergraphs can only capture industry correlation relatively superficially. These methods face two key limitations: they do not fully consider inter-industry lead-lag interactions, and they do not model multi-scale information within and among industries. This study proposes the Hermes framework for stock time series forecasting that aims to improve the exploitation of industry correlation by eliminating these limitations. The framework integrates moving aggregation and multi-scale fusion modules in a hypergraph network. Specifically, to more flexibly capture the lead-lag relationships among industries, Hermes proposes a hyperedge-based moving aggregation module. This module incorporates a sliding window and utilizes dynamic temporal aggregation operations to consider lead-lag dependencies among industries. Additionally, to effectively model multi-scale information, Hermes employs cross-scale, edge-to-edge message passing to integrate information from different scales while maintaining the consistency of each scale. Experimental results on multiple real-world stock datasets show that Hermes outperforms existing state-of-the-art methods in both efficiency and accuracy.
LG | Sep 27, 2025
ASTGI: Adaptive Spatio-Temporal Graph Interactions for Irregular Multivariate Time Series Forecasting
Xvyuan Liu, Xiangfei Qiu, Hanyin Cheng et al.
Irregular multivariate time series (IMTS) are prevalent in critical domains like healthcare and finance, where accurate forecasting is vital for proactive decision-making. However, the asynchronous sampling and irregular intervals inherent to IMTS pose two core challenges for existing methods: (1) how to accurately represent the raw information of irregular time series without introducing data distortion, and (2) how to effectively capture the complex dynamic dependencies between observation points. To address these challenges, we propose the Adaptive Spatio-Temporal Graph Interaction (ASTGI) framework. Specifically, the framework first employs a Spatio-Temporal Point Representation module to encode each discrete observation as a point within a learnable spatio-temporal embedding space. Second, a Neighborhood-Adaptive Graph Construction module adaptively builds a causal graph for each point in the embedding space via nearest neighbor search. Subsequently, a Spatio-Temporal Dynamic Propagation module iteratively updates information on these adaptive causal graphs by generating messages and computing interaction weights based on the relative spatio-temporal positions between points. Finally, a Query Point-based Prediction module generates the final forecast by aggregating neighborhood information for a new query point and performing regression. Extensive experiments on multiple benchmark datasets demonstrate that ASTGI outperforms various state-of-the-art methods.
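The graph-construction step in this abstract, building a neighbourhood for each observation point by nearest-neighbour search, can be sketched as follows. This is an illustration under explicit assumptions, not the ASTGI module: the embedding is learnable in the paper but is replaced here by fixed 2-D coordinates (time, channel position), "causal" is interpreted as allowing only neighbours that are not later in time, and `causal_knn` is a hypothetical name.

```python
import math


def causal_knn(points, k):
    """For each point, return indices of its k nearest non-future neighbours.

    points: list of (time, channel_coord) tuples, one per observation.
    A neighbour j of point i must satisfy time_j <= time_i (assumed
    reading of "causal"); a point is never its own neighbour.
    """
    graph = []
    for i, (t_i, _) in enumerate(points):
        allowed = [j for j, (t_j, _) in enumerate(points)
                   if j != i and t_j <= t_i]
        allowed.sort(key=lambda j: math.dist(points[i], points[j]))
        graph.append(allowed[:k])
    return graph


# Irregularly sampled observations from two channels (coordinates 0.0 and 1.0).
points = [(0.0, 0.0), (0.5, 1.0), (1.0, 0.0), (1.2, 1.0)]
graph = causal_knn(points, k=2)
```

Because every observation is treated as its own point, no resampling onto a regular grid is needed, which matches the abstract's stated goal of avoiding data distortion.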