Abstracted Shapes as Tokens -- A Generalizable and Interpretable Model for Time-series Classification
This work addresses the need for interpretable and generalizable time-series models, a need that is especially acute in domains such as healthcare and finance. Its contribution is incremental, however, in that it combines existing techniques to improve explainability.
The paper tackles the problem of black-box time-series models by introducing VQShape, a pre-trained model that uses vector quantization to represent time-series as abstracted shapes. VQShape achieves classification performance comparable to specialist models and demonstrates zero-shot generalization to unseen domains.
In time-series analysis, many recent works seek to provide a unified view and representation for time-series across multiple domains, leading to the development of foundation models for time-series data. Despite diverse modeling techniques, existing models are black boxes and fail to provide insights and explanations about their representations. In this paper, we present VQShape, a pre-trained, generalizable, and interpretable model for time-series representation learning and classification. By introducing a novel representation for time-series data, we forge a connection between the latent space of VQShape and shape-level features. Using vector quantization, we show that time-series from different domains can be described using a unified set of low-dimensional codes, where each code can be represented as an abstracted shape in the time domain. On classification tasks, we show that the representations of VQShape can be utilized to build interpretable classifiers, achieving comparable performance to specialist models. Additionally, in zero-shot learning, VQShape and its codebook can generalize to previously unseen datasets and domains that are not included in the pre-training process. The code and pre-trained weights are available at https://github.com/YunshiWen/VQShape.
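To make the vector-quantization idea above concrete, the following is a minimal sketch of a nearest-neighbor codebook lookup. It is not the VQShape implementation: the function names `quantize` and `code_histogram`, the toy codebook, and the choice of Euclidean distance are all assumptions made for illustration. The sketch only shows how continuous latent vectors can be mapped to a shared set of discrete codes and then summarized as code counts.

```python
import numpy as np


def quantize(latents, codebook):
    """Map each latent vector to its nearest codebook entry.

    Hypothetical helper: uses squared Euclidean distance, which is an
    assumption and not necessarily what VQShape uses.
    """
    # Pairwise squared distances, shape (num_latents, num_codes).
    diffs = latents[:, None, :] - codebook[None, :, :]
    dists = np.sum(diffs ** 2, axis=-1)
    indices = np.argmin(dists, axis=1)
    return indices, codebook[indices]


def code_histogram(indices, num_codes):
    """Count how often each code is used (a simple token-count feature)."""
    return np.bincount(indices, minlength=num_codes)


# Toy codebook (assumed): 8 codes of dimension 4, where code k is [k, k, k, k].
codebook = np.arange(8, dtype=float)[:, None] * np.ones((1, 4))

# Three latent vectors lying close to codes 2, 5, and 2.
latents = codebook[[2, 5, 2]] + 0.1

indices, quantized = quantize(latents, codebook)
hist = code_histogram(indices, num_codes=len(codebook))

print(indices)  # code index assigned to each latent vector
print(hist)     # number of times each code was selected
```

In this toy setup each latent vector is replaced by the index of its closest code, so a series becomes a short sequence of discrete tokens. A count vector over those tokens is one simple way such tokens could feed a downstream classifier.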