LG ML Oct 11, 2019

ORCCA: Optimal Randomized Canonical Correlation Analysis

arXiv:1910.05384v3 4.85 citations

Originality: Incremental advance

AI Analysis

This work improves kernel approximation for CCA; it is an incremental, domain-specific advance in machine learning.

The authors address the selection of random features for kernel approximation in Canonical Correlation Analysis (CCA). They propose ORCCA, a task-specific scoring rule that selects features to maximize the canonical correlations. They prove that ORCCA can outperform, in expectation, the corresponding Kernel CCA with a default kernel, and their numerical experiments show that it is significantly better than other approximation techniques on the CCA task.

The random features approach has been widely used for kernel approximation in large-scale machine learning. A number of recent studies have explored data-dependent sampling of features, modifying the stochastic oracle from which random features are sampled. While the proposed techniques in this realm improve the approximation, their suitability is often verified on a single learning task. In this paper, we propose a task-specific scoring rule for selecting random features, which can be employed for different applications with some adjustments. We restrict our attention to Canonical Correlation Analysis (CCA), and we provide a novel, principled guide for finding the score function maximizing the canonical correlations. We prove that this method, called ORCCA, can outperform (in expectation) the corresponding Kernel CCA with a default kernel. Numerical experiments verify that ORCCA is significantly superior to other approximation techniques in the CCA task.
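For readers unfamiliar with the random features approach mentioned in the abstract, the following is a minimal sketch of plain, data-independent random Fourier features approximating a Gaussian kernel. It is not the ORCCA scoring rule, which the abstract does not specify in detail. The function names, the Gaussian kernel choice, the bandwidth `sigma`, and the feature count are all illustrative assumptions.

```python
import math
import random


def sample_rff_params(dim, num_features, sigma, rng):
    """Draw frequencies w ~ N(0, I / sigma^2) and phases b ~ U[0, 2*pi]."""
    weights = [[rng.gauss(0.0, 1.0 / sigma) for _ in range(dim)]
               for _ in range(num_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    return weights, phases


def rff_map(x, weights, phases):
    """Map x to z(x) so that z(x) . z(y) approximates the Gaussian kernel."""
    scale = math.sqrt(2.0 / len(weights))
    return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
            for w, b in zip(weights, phases)]


def gaussian_kernel(x, y, sigma):
    """Exact kernel k(x, y) = exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


# Hypothetical settings chosen only for illustration.
rng = random.Random(0)
sigma = 1.0
weights, phases = sample_rff_params(dim=3, num_features=2000, sigma=sigma, rng=rng)

x = [0.2, -0.1, 0.4]
y = [0.0, 0.3, 0.1]
zx = rff_map(x, weights, phases)
zy = rff_map(y, weights, phases)

approx = sum(a * b for a, b in zip(zx, zy))
exact = gaussian_kernel(x, y, sigma)
print(f"exact={exact:.3f} approx={approx:.3f}")
```

Here every feature is drawn from the same fixed distribution. Data-dependent and task-specific schemes such as the one the abstract describes instead score candidate features and keep those most useful for the task, so fewer features are needed for the same quality.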
