Decoding Latent Spaces: Assessing the Interpretability of Time Series Foundation Models for Visual Analytics
This work addresses the challenge of interpreting latent spaces in time series foundation models for researchers and practitioners in visual analytics. Its contribution is incremental: it mainly highlights limitations and suggests future improvements.
The study assessed the interpretability of latent spaces in time series foundation models such as MOMENT for visual analytics. Fine-tuning reduced the loss but did not significantly improve embedding interpretability, which suggests that further methodological refinements are needed.
The present study explores the interpretability of latent spaces produced by time series foundation models, focusing on their potential for visual analysis tasks. Specifically, we evaluate the MOMENT family of models, a set of transformer-based, pre-trained architectures for multivariate time series tasks such as imputation, prediction, classification, and anomaly detection. Using five datasets, we evaluate how well these models capture the underlying structures of time series data in their latent space projections, and we test whether fine-tuning improves the clarity of the resulting embedding spaces. Fine-tuning yielded notable performance improvements in terms of loss reduction. However, visual analysis shows only limited improvement in the interpretability of the embeddings, so further work is needed. The results suggest that, although Time Series Foundation Models such as MOMENT are robust, their latent spaces may require additional methodological refinements to be adequately interpreted, such as alternative projection techniques, loss functions, or data preprocessing strategies. Despite the limitations of MOMENT, foundation models offer a large reduction in execution time and therefore represent a significant advance for interactive visual analytics.
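
To make the idea of inspecting a latent space projection concrete, the following is a minimal sketch and not the pipeline used in this study. It assumes the embeddings are already available as a NumPy array with one row per time series window; here they are replaced by synthetic 64-dimensional vectors drawn from two clusters. PCA is used only as a simple stand-in projection technique, and the names `pca_project` and `centroid_separation` are hypothetical helpers introduced for illustration.

```python
import numpy as np


def pca_project(embeddings: np.ndarray, n_components: int = 2) -> np.ndarray:
    """Project row-wise embeddings onto their top principal components."""
    centered = embeddings - embeddings.mean(axis=0, keepdims=True)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def centroid_separation(points: np.ndarray, labels: np.ndarray) -> float:
    """Distance between the two class centroids divided by the mean
    within-class spread. Assumes exactly two classes (illustrative only)."""
    groups = [points[labels == k] for k in np.unique(labels)]
    centroids = [g.mean(axis=0) for g in groups]
    between = np.linalg.norm(centroids[0] - centroids[1])
    within = np.mean(
        [np.linalg.norm(g - c, axis=1).mean() for g, c in zip(groups, centroids)]
    )
    return float(between / within)


# Synthetic stand-in for model embeddings (assumption: 200 windows, 64 dims).
rng = np.random.default_rng(0)
cluster_a = rng.normal(loc=0.0, scale=1.0, size=(100, 64))
cluster_b = rng.normal(loc=3.0, scale=1.0, size=(100, 64))
embeddings = np.vstack([cluster_a, cluster_b])
labels = np.array([0] * 100 + [1] * 100)

projected = pca_project(embeddings, n_components=2)
score = centroid_separation(projected, labels)
```

A higher `score` indicates that the two groups remain well separated after projection. Comparing such a score before and after fine-tuning is one possible way to complement visual inspection, under the assumption that ground-truth groupings are available.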