ML · LG · Sep 17, 2019

BLOCCS: Block Sparse Canonical Correlation Analysis With Application To Interpretable Omics Integration

arXiv:1909.07944v2 · 1 citation
AI Analysis

This work addresses the need for more interpretable and efficient integration of omics data in bioinformatics, though it is incremental as it builds on prior sparse CCA methods.

The authors aim to improve the interpretability and computational efficiency of sparse canonical correlation analysis (sCCA). They introduce a block sparse CCA method that estimates multiple pairs of canonical directions simultaneously, which yields better orthogonality between the directions and more interpretable solutions. In simulations, the method outperformed existing sCCA algorithms in computational cost and stability, and it was applied to multi-omic cancer data to capture meaningful biological associations.

We introduce Block Sparse Canonical Correlation Analysis, which estimates multiple pairs of canonical directions (together a "block") at once, resulting in significantly improved orthogonality of the sparse directions, which, we demonstrate, translates to more interpretable solutions. Our approach builds on the sparse CCA method of Solari, Brown, and Bickel (2019) in that we also express the bi-convex objective of our block formulation as a concave minimization problem over an orthogonal k-frame in a unit Euclidean ball, which in turn, due to the concavity of the objective, is shrunk to a Stiefel manifold and optimized via a gradient descent algorithm. Our simulations show that our method outperforms existing sCCA algorithms and implementations in terms of computational cost and stability, mainly due to the drastic shrinkage of our search space, as well as in the correlation within and orthogonality between pairs of estimated canonical covariates. Finally, we apply our method, available as an R package called BLOCCS, to multi-omic data on Lung Squamous Cell Carcinoma (LUSC) obtained via The Cancer Genome Atlas, and demonstrate its capability in capturing meaningful biological associations relevant to the hypothesis under study rather than spurious dominant variations.
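
The idea of optimizing a whole block of k directions on a Stiefel manifold (the set of matrices with orthonormal columns) can be illustrated with a small sketch. This is not the BLOCCS algorithm and has no sparsity penalty; it is a generic gradient step followed by a Gram-Schmidt retraction, under assumptions of our own. The matrix `C` stands in for a cross-covariance matrix, the objective `0.5 * ||C^T U||_F^2` is a convex function chosen for illustration (maximizing it is the same as minimizing its concave negative), and all function names (`retract`, `stiefel_ascent`, `orthogonality_error`) are hypothetical.

```python
import math
import random


def transpose(A):
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    # Plain row-by-column product of two matrices stored as lists of rows.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def retract(U):
    """Map a p x k matrix back to orthonormal columns (modified Gram-Schmidt)."""
    cols = []
    for col in transpose(U):
        v = list(col)
        for q in cols:
            proj = sum(a * b for a, b in zip(q, v))
            v = [a - proj * b for a, b in zip(v, q)]
        norm = math.sqrt(sum(a * a for a in v))
        cols.append([a / norm for a in v])
    return transpose(cols)


def objective(C, U):
    """Illustrative objective 0.5 * ||C^T U||_F^2 (an assumption, not the paper's)."""
    CtU = matmul(transpose(C), U)
    return 0.5 * sum(x * x for row in CtU for x in row)


def orthogonality_error(U):
    """Largest absolute deviation of U^T U from the identity."""
    G = matmul(transpose(U), U)
    k = len(G)
    return max(abs(G[i][j] - (1.0 if i == j else 0.0))
               for i in range(k) for j in range(k))


def stiefel_ascent(C, k, step=0.1, iters=200, seed=0):
    """Gradient ascent on the objective, retracting to orthonormal columns each step."""
    rng = random.Random(seed)
    p = len(C)
    U = retract([[rng.gauss(0, 1) for _ in range(k)] for _ in range(p)])
    CCt = matmul(C, transpose(C))
    for _ in range(iters):
        grad = matmul(CCt, U)  # Euclidean gradient of the objective w.r.t. U
        U = retract([[u + step * g for u, g in zip(u_row, g_row)]
                     for u_row, g_row in zip(U, grad)])
    return U


# Toy stand-in for a cross-covariance matrix (4 variables vs. 3 variables).
C = [[3.0, 0.0, 0.0],
     [0.0, 2.0, 0.0],
     [0.0, 0.0, 1.0],
     [0.0, 0.0, 0.0]]
U = stiefel_ascent(C, k=2)
print(objective(C, U), orthogonality_error(U))
```

Because all k columns are updated and re-orthonormalized together, orthogonality between the directions holds by construction at every iterate, instead of being approximated by estimating one direction at a time. A sparse variant would need an additional penalty or thresholding step, which this sketch omits.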

