InfoDPCCA: Information-Theoretic Dynamic Probabilistic Canonical Correlation Analysis
This work addresses the modeling of interdependent sequential data, with applications in natural science and engineering, and builds incrementally on existing dynamic CCA approaches.
The authors tackled the problem of extracting meaningful latent representations from high-dimensional sequential data by introducing InfoDPCCA, a dynamic probabilistic CCA framework. It uses an information-theoretic objective to capture the mutual structure between two data streams while balancing compression and predictive sufficiency. Restricting the shared latent space to the mutual information between the sequences is intended to improve interpretability and robustness relative to prior dynamic CCA models. The authors evaluated the method on synthetic and medical fMRI data, where it performed well as a representation learning tool.
Extracting meaningful latent representations from high-dimensional sequential data is a crucial challenge in machine learning, with applications spanning natural science and engineering. We introduce InfoDPCCA, a dynamic probabilistic Canonical Correlation Analysis (CCA) framework designed to model two interdependent sequences of observations. InfoDPCCA uses a novel information-theoretic objective to extract a shared latent representation that captures the mutual structure between the data streams while balancing representation compression against predictive sufficiency. It also learns separate latent components that encode information specific to each sequence. Unlike prior dynamic CCA models such as DPCCA, our approach explicitly constrains the shared latent space to encode only the mutual information between the sequences, which improves interpretability and robustness. We further introduce a two-step training scheme to bridge the gap between information-theoretic representation learning and generative modeling, along with a residual connection mechanism to enhance training stability. Through experiments on synthetic and medical fMRI data, we demonstrate that InfoDPCCA excels as a tool for representation learning. Code for InfoDPCCA is available at https://github.com/marcusstang/InfoDPCCA.
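To make the setting concrete, the following is a minimal toy sketch of two interdependent sequences driven by one shared latent state and two sequence-specific latent states. It is an illustration only, not the InfoDPCCA model or its training objective: the linear-Gaussian dynamics, the function name `simulate_two_streams`, and all parameters and dimensions are assumptions chosen for simplicity.

```python
import numpy as np


def simulate_two_streams(T=200, d_shared=2, d_private=1, d_obs=5,
                         noise=0.1, seed=0):
    """Toy linear-Gaussian generator (hypothetical, for illustration only).

    z0 is a latent state shared by both sequences; z1 and z2 are latent
    states specific to sequence 1 and sequence 2, respectively.
    """
    rng = np.random.default_rng(seed)

    # Assumed stable linear transitions for each latent chain.
    A0 = 0.9 * np.eye(d_shared)
    A1 = 0.5 * np.eye(d_private)
    A2 = 0.5 * np.eye(d_private)

    # Assumed random linear emissions: each stream sees the shared state
    # concatenated with its own private state.
    W1 = rng.normal(size=(d_obs, d_shared + d_private))
    W2 = rng.normal(size=(d_obs, d_shared + d_private))

    z0 = np.zeros(d_shared)
    z1 = np.zeros(d_private)
    z2 = np.zeros(d_private)
    x1 = np.empty((T, d_obs))
    x2 = np.empty((T, d_obs))

    for t in range(T):
        # Advance each latent chain by one step with unit Gaussian noise.
        z0 = A0 @ z0 + rng.normal(size=d_shared)
        z1 = A1 @ z1 + rng.normal(size=d_private)
        z2 = A2 @ z2 + rng.normal(size=d_private)

        # Emit one noisy observation per stream.
        x1[t] = W1 @ np.concatenate([z0, z1]) + noise * rng.normal(size=d_obs)
        x2[t] = W2 @ np.concatenate([z0, z2]) + noise * rng.normal(size=d_obs)

    return x1, x2


if __name__ == "__main__":
    x1, x2 = simulate_two_streams()
    print(x1.shape, x2.shape)
```

In this toy setup, any dependence between the two observation sequences arises only through `z0`, which is the kind of structure a shared latent representation is meant to isolate, while `z1` and `z2` play the role of sequence-specific components.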