ML LG · Jan 16

Split-and-Conquer: Distributed Factor Modeling for High-Dimensional Matrix-Variate Time Series

arXiv:2601.11091v1 · 1.7 · h-index: 1

Originality: Incremental advance

AI Analysis

This work targets researchers and practitioners who handle large-scale matrix-variate time series, for example in finance or sensor networks. The advance is incremental: it builds on existing distributed approaches, and its main contribution is preserving the matrix structure of the data.

The paper tackles dimensionality reduction for high-dimensional, large-scale, heterogeneous matrix-variate time series. It proposes a distributed factor modeling framework that preserves the latent matrix structure, which the authors argue improves computational efficiency and information utilization. Simulations assess the framework's computational efficiency and estimation accuracy.

In this paper, we propose a distributed framework for reducing the dimensionality of high-dimensional, large-scale, heterogeneous matrix-variate time series data using a factor model. The data are first partitioned column-wise (or row-wise) and allocated to node servers, where each node estimates the row (or column) loading matrix via two-dimensional tensor PCA. These local estimates are then transmitted to a central server and aggregated, followed by a final PCA step to obtain the global row (or column) loading matrix estimator. Given the estimated loading matrices, the corresponding factor matrices are subsequently computed. Unlike existing distributed approaches, our framework preserves the latent matrix structure, thereby improving computational efficiency and enhancing information utilization. We also discuss row- and column-wise clustering procedures for settings in which the group memberships are unknown. Furthermore, we extend the analysis to unit-root nonstationary matrix-variate time series. Asymptotic properties of the proposed method are derived as the dimension of the data in each computing unit and the sample size $T$ diverge. Simulation results assess the computational efficiency and estimation accuracy of the proposed framework, and real data applications further validate its predictive performance.
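The split, local-estimate, aggregate, and final-PCA steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under assumptions the abstract does not spell out: a matrix factor model of the form X_t = R F_t C' + E_t with orthonormal loadings, a local estimate taken as the leading eigenvectors of each node's sample row covariance, and aggregation by averaging the local projection matrices before the final eigendecomposition. All function names are hypothetical, and the paper's actual estimator and normalization may differ.

```python
import numpy as np

def top_eigvecs(sym, k):
    """Return the k leading eigenvectors of a symmetric matrix as columns."""
    vals, vecs = np.linalg.eigh(sym)
    return vecs[:, np.argsort(vals)[::-1][:k]]

def local_row_loading(X_block, k):
    """Node-level step (assumed form): PCA on the block's row covariance.

    X_block has shape (T, p, q_s): T observations of a p x q_s column block.
    """
    T, p, q_s = X_block.shape
    # M = (1 / (T * q_s)) * sum_t X_t X_t'
    M = np.einsum('tij,tkj->ik', X_block, X_block) / (T * q_s)
    return top_eigvecs(M, k)

def aggregate_loadings(local_estimates, k):
    """Central-server step (assumed form): average projections, then PCA."""
    avg_proj = sum(U @ U.T for U in local_estimates) / len(local_estimates)
    return top_eigvecs(avg_proj, k)

def distributed_row_loading(X, k, n_nodes):
    """Partition X (T, p, q) column-wise, estimate locally, then aggregate."""
    blocks = np.array_split(X, n_nodes, axis=2)
    return aggregate_loadings([local_row_loading(b, k) for b in blocks], k)

def estimate_factors(X, R_hat, C_hat):
    """F_t = R_hat' X_t C_hat, assuming orthonormal loading estimates."""
    return np.einsum('ik,tij,jl->tkl', R_hat, X, C_hat)
```

In this sketch the column loading matrix is obtained by applying the same routine to the transposed series, `distributed_row_loading(X.transpose(0, 2, 1), k2, n_nodes)`, which corresponds to a row-wise partition of the original data. Only the p x k local estimates travel to the central server, not the raw blocks.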

