LG | ML | Apr 23, 2020

Sparse Generalized Canonical Correlation Analysis: Distributed Alternating Iteration based Approach

arXiv:2004.10981v1 | 3 citations
Originality: Incremental advance
AI Analysis

This work addresses the analysis of multiview data with sparse structures, a problem of interest to researchers in statistics and machine learning, and represents an incremental extension of sparse CCA.

Sparse canonical correlation analysis (CCA) is limited to two datasets. The paper proposes sparse generalized canonical correlation analysis (GCCA) to detect latent relations in multiview data with sparse structures, and reports experiments on synthetic and real-world datasets that demonstrate its effectiveness.

Sparse canonical correlation analysis (CCA) is a useful statistical tool for detecting latent information with sparse structures. However, sparse CCA works only for two datasets, i.e., there are only two views or two distinct objects. To overcome this limitation, we propose a sparse generalized canonical correlation analysis (GCCA), which can detect the latent relations of multiview data with sparse structures. Moreover, the introduced sparsity can be interpreted as a Laplace prior on the canonical variates. Specifically, we convert the GCCA into a linear system of equations and impose an $\ell_1$ minimization penalty for sparsity pursuit. This results in a nonconvex problem on the Stiefel manifold, which is difficult to solve. Motivated by Boyd's consensus problem, we develop an algorithm based on a distributed alternating iteration approach and analyze its theoretical consistency under mild conditions. Experiments on several synthetic and real-world datasets demonstrate the effectiveness of the proposed algorithm.
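
The link between the $\ell_1$ penalty and a Laplace prior mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the function names `soft_threshold` and `laplace_neg_log_prior` and the parameters `lam` and `scale` are hypothetical, chosen only for illustration. The sketch shows that the negative log-density of a zero-mean Laplace prior is an $\ell_1$ norm up to a constant, and that the proximal step commonly paired with an $\ell_1$ penalty (soft-thresholding) sets small coefficients exactly to zero.

```python
import math


def soft_threshold(values, lam):
    """Entrywise proximal operator of lam * ||x||_1 (illustrative helper).

    Entries with magnitude at most lam become exactly zero; the rest
    shrink toward zero by lam.
    """
    shrunk = []
    for v in values:
        if v > lam:
            shrunk.append(v - lam)
        elif v < -lam:
            shrunk.append(v + lam)
        else:
            shrunk.append(0.0)
    return shrunk


def laplace_neg_log_prior(values, scale):
    """Negative log-density of i.i.d. zero-mean Laplace(scale) entries.

    Equals ||x||_1 / scale plus a constant that does not depend on x,
    which is why a Laplace prior corresponds to an l1 penalty.
    """
    l1_norm = sum(abs(v) for v in values)
    return l1_norm / scale + len(values) * math.log(2.0 * scale)


if __name__ == "__main__":
    weights = [3.0, -0.5, 0.2, -2.0]  # assumed toy coefficients
    print(soft_threshold(weights, 1.0))
    print(laplace_neg_log_prior(weights, 1.0))
```

In this toy setting, a larger `lam` (equivalently, a smaller Laplace `scale`) zeroes more entries. The orthogonality constraint that places the actual problem on the Stiefel manifold is deliberately left out of the sketch.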

