Estimating shared subspace with AJIVE: the power and limitation of multiple data matrices
The paper addresses the challenge of disentangling joint and individual variation in integrative data analysis and gives researchers theoretical insight into performance limits, though the contribution is incremental in that it builds on existing JIVE methods.
This paper analyzes the Angle-based Joint and Individual Variation Explained (AJIVE) method for estimating shared subspaces in multi-matrix data, showing that its error decreases with more matrices in high signal-to-noise ratio (SNR) regimes but remains non-diminishing in low-SNR settings, with minimax lower bounds confirming optimality in high-SNR cases.
Integrative data analysis often requires disentangling joint and individual variations across multiple datasets, a challenge commonly addressed by the Joint and Individual Variation Explained (JIVE) model. While numerous methods have been developed to estimate the shared subspace under JIVE, the theoretical understanding of their performance remains limited, particularly in the context of multiple matrices and varying degrees of subspace misalignment. This paper bridges this gap by providing a systematic analysis of shared subspace estimation in multi-matrix settings. We focus on the Angle-based Joint and Individual Variation Explained (AJIVE) method, a two-stage spectral approach, and establish new performance guarantees that uncover its strengths and limitations. Specifically, we show that in high signal-to-noise ratio (SNR) regimes, AJIVE's estimation error decreases with the number of matrices, demonstrating the power of multi-matrix integration. Conversely, in low-SNR settings, AJIVE exhibits a non-diminishing error, highlighting fundamental limitations. To complement these results, we derive minimax lower bounds, showing that AJIVE achieves optimal rates in high-SNR regimes. Furthermore, we analyze an oracle-aided spectral estimator to demonstrate that the non-diminishing error in low-SNR scenarios is a fundamental barrier. Extensive numerical experiments corroborate our theoretical findings, providing insights into the interplay between SNR, the number of matrices, and subspace misalignment.
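To make the "two-stage spectral approach" concrete, here is a minimal NumPy sketch of one common reading of it: first extract a low-rank signal subspace from each matrix by SVD, then take the leading singular vectors of the concatenated bases as the shared-subspace estimate. This is an illustration under stated assumptions, not the paper's implementation. The function names, the choice of left singular vectors, the simulation setup (equal signal strength, Gaussian noise, individual subspaces orthogonal to the joint one), and the decision to pass all ranks in by hand rather than select them from the data are assumptions made for this example.

```python
import numpy as np


def top_left_singular_vectors(A, r):
    """Orthonormal basis for the span of the r leading left singular vectors of A."""
    U, _, _ = np.linalg.svd(A, full_matrices=False)
    return U[:, :r]


def ajive_shared_subspace(matrices, ranks, joint_rank):
    """Two-stage spectral estimate of the shared subspace (illustrative sketch).

    Stage 1: estimate each matrix's signal subspace from its top singular vectors.
    Stage 2: concatenate the estimated bases and keep the leading singular
             vectors, i.e. the directions present in (nearly) every basis.
    Ranks are supplied by the caller here; rank selection is not modeled.
    """
    bases = [top_left_singular_vectors(A, r) for A, r in zip(matrices, ranks)]
    stacked = np.hstack(bases)
    return top_left_singular_vectors(stacked, joint_rank)


def sin_theta_distance(U_hat, U_true):
    """Spectral-norm sin-theta distance: ||(I - U_hat U_hat^T) U_true||_2."""
    residual = U_true - U_hat @ (U_hat.T @ U_true)
    return np.linalg.norm(residual, 2)


def simulate(K, n=60, d=80, joint_rank=2, indiv_rank=2,
             signal=50.0, noise=1.0, seed=0):
    """Draw K noisy matrices with a common joint subspace; return the estimation error.

    Hypothetical setup: every signal singular value equals `signal`, and the
    noise entries are i.i.d. Gaussian with standard deviation `noise`.
    """
    rng = np.random.default_rng(seed)
    U_joint, _ = np.linalg.qr(rng.standard_normal((n, joint_rank)))
    total_rank = joint_rank + indiv_rank
    matrices = []
    for _ in range(K):
        # Individual directions, made orthogonal to the joint subspace.
        B = rng.standard_normal((n, indiv_rank))
        B -= U_joint @ (U_joint.T @ B)
        U_indiv, _ = np.linalg.qr(B)
        left = np.hstack([U_joint, U_indiv])
        right, _ = np.linalg.qr(rng.standard_normal((d, total_rank)))
        A = signal * left @ right.T + noise * rng.standard_normal((n, d))
        matrices.append(A)
    U_hat = ajive_shared_subspace(matrices, [total_rank] * K, joint_rank)
    return sin_theta_distance(U_hat, U_joint)


if __name__ == "__main__":
    for K in (2, 5, 20):
        print(K, simulate(K))
```

Running `simulate` over a grid of `K` and `signal` values is one way to explore the abstract's qualitative claims numerically: lowering `signal` relative to `noise` moves the experiment toward the low-SNR regime, while increasing `K` adds matrices to integrate. The sketch makes no claim about the specific rates or constants established in the paper.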