Partial Channel Dependence with Channel Masks for Time Series Foundation Models
This work addresses a key limitation of time series foundation models by enabling finer-grained handling of channel dependencies, which is relevant to both researchers and practitioners. The contribution is incremental in that it builds on existing model architectures.
The paper tackles the problem of implicit heterogeneity in time series data, such as varying dependencies between channels, by introducing partial channel dependence (PCD) with a channel mask, resulting in improved performance across forecasting, classification, imputation, and anomaly detection tasks under diverse settings.
Recent advancements in foundation models have been successfully extended to the time series (TS) domain, facilitated by the emergence of large-scale TS datasets. However, previous efforts have primarily focused on designing model architectures to address explicit heterogeneity among datasets, such as varying numbers of channels, while often overlooking implicit heterogeneity, such as varying dependencies between channels. In this work, we introduce the concept of partial channel dependence (PCD), which enables a more sophisticated adjustment of channel dependencies based on dataset-specific information. To achieve PCD, we propose a channel mask that captures the relationships between channels within a dataset using two key components: 1) a correlation matrix that encodes relative dependencies between channels, and 2) domain parameters that learn the absolute dependencies specific to each dataset, refining the correlation matrix. We validate the effectiveness of PCD across four TS tasks (forecasting, classification, imputation, and anomaly detection) under diverse settings, including few-shot and zero-shot scenarios, with both TS foundation models and single-task models. Code is available at https://github.com/seunghan96/CM.
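The two components of the channel mask can be illustrated with a small sketch. This is not the authors' implementation (see the linked repository for that). The abstract does not state how the domain parameters refine the correlation matrix, so the scale-and-shift form, the sigmoid, the use of absolute correlation, and the names `pearson`, `channel_mask`, `alpha`, and `beta` are all assumptions made for illustration. In a real model the domain parameters would be learned per dataset; here they are fixed by hand.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: treat a constant channel as uncorrelated
    return cov / math.sqrt(var_x * var_y)


def channel_mask(channels, alpha, beta):
    """Build a C x C mask with entries in (0, 1).

    channels: list of C sequences, one per channel.
    alpha, beta: hypothetical domain parameters (scale and shift) that
    refine the correlation matrix; assumed form, not taken from the paper.
    """
    c = len(channels)
    mask = [[0.0] * c for _ in range(c)]
    for i in range(c):
        for j in range(c):
            # Relative dependency: magnitude of the correlation.
            r = abs(pearson(channels[i], channels[j]))
            # Absolute dependency: dataset-specific scale and shift,
            # squashed to (0, 1) with a sigmoid.
            mask[i][j] = 1.0 / (1.0 + math.exp(-(alpha * r + beta)))
    return mask


# Toy dataset with three channels of length five.
channels = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.0],   # perfectly correlated with channel 0
    [1.0, -1.0, 1.0, -1.0, 1.0],  # uncorrelated with channel 0
]
mask = channel_mask(channels, alpha=4.0, beta=-2.0)
for row in mask:
    print([round(v, 3) for v in row])
```

Under these assumptions, the strongly correlated pair (channels 0 and 1) receives a larger mask value than the uncorrelated pair (channels 0 and 2), while `alpha` and `beta` shift the overall level of dependence for the dataset. Such a mask could then be multiplied element-wise with a channel-wise attention matrix, although the abstract does not specify where the mask is applied.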