LG · Nov 15, 2015

Large-Scale Approximate Kernel Canonical Correlation Analysis

arXiv:1511.04773v4 · 58 citations
Originality: Incremental advance
AI Analysis

This work addresses the scalability of nonlinear multi-view representation learning, which matters to researchers and practitioners working with large datasets. The advance is incremental: it builds on existing approximation and optimization techniques.

The paper tackles the computational bottleneck of kernel canonical correlation analysis (KCCA) on large-scale datasets by combining random feature approximations with a stochastic optimization algorithm. This makes it possible to run approximate KCCA on a speech dataset with 1.4 million training samples and a 100,000-dimensional random feature space on a typical workstation.

Kernel canonical correlation analysis (KCCA) is a nonlinear multi-view representation learning technique with broad applicability in statistics and machine learning. Although there is a closed-form solution for the KCCA objective, it involves solving an $N\times N$ eigenvalue system where $N$ is the training set size, making its computational requirements in both memory and time prohibitive for large-scale problems. Various approximation techniques have been developed for KCCA. A commonly used approach is to first transform the original inputs to an $M$-dimensional random feature space so that inner products in the feature space approximate kernel evaluations, and then apply linear CCA to the transformed inputs. In many applications, however, the dimensionality $M$ of the random feature space may need to be very large in order to obtain a sufficiently good approximation; it then becomes challenging to perform the linear CCA step on the resulting very high-dimensional data matrices. We show how to use a stochastic optimization algorithm, recently proposed for linear CCA and its neural-network extension, to further alleviate the computation requirements of approximate KCCA. This approach allows us to run approximate KCCA on a speech dataset with $1.4$ million training samples and a random feature space of dimensionality $M=100000$ on a typical workstation.
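
The baseline that the abstract describes (map each view to random features, then apply linear CCA) can be sketched as below. This is a minimal illustration, not the paper's stochastic optimization algorithm: it solves linear CCA exactly with a whitened SVD, which is only practical for small $M$. The Gaussian kernel, the function names `random_fourier_features` and `linear_cca`, and the parameters `gamma` and `reg` are assumptions made for the example.

```python
import numpy as np


def random_fourier_features(X, M, gamma, rng):
    """Map X (N x d) to M random Fourier features whose inner products
    approximate the Gaussian kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    d = X.shape[1]
    # Frequencies are drawn from the kernel's spectral density, N(0, 2*gamma*I).
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, M))
    b = rng.uniform(0.0, 2.0 * np.pi, size=M)
    return np.sqrt(2.0 / M) * np.cos(X @ W + b)


def linear_cca(F, G, k, reg=1e-3):
    """Exact regularized linear CCA on two views F, G (each N x M).
    Returns the top-k canonical correlations and projection matrices."""
    N = F.shape[0]
    F = F - F.mean(axis=0)
    G = G - G.mean(axis=0)
    Sff = F.T @ F / N + reg * np.eye(F.shape[1])
    Sgg = G.T @ G / N + reg * np.eye(G.shape[1])
    Sfg = F.T @ G / N
    # Whiten both views with Cholesky factors: T = Lf^{-1} Sfg Lg^{-T}.
    Lf = np.linalg.cholesky(Sff)
    Lg = np.linalg.cholesky(Sgg)
    T = np.linalg.solve(Lf, Sfg)
    T = np.linalg.solve(Lg, T.T).T
    U, s, Vt = np.linalg.svd(T, full_matrices=False)
    # Map the singular vectors back to the unwhitened feature space.
    A = np.linalg.solve(Lf.T, U[:, :k])
    B = np.linalg.solve(Lg.T, Vt.T[:, :k])
    return s[:k], A, B


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two synthetic views that share a nonlinear dependence on a latent t.
    t = rng.uniform(-1.0, 1.0, size=(1000, 1))
    X = np.hstack([t, rng.normal(size=(1000, 1))])
    Y = np.hstack([t ** 2, rng.normal(size=(1000, 1))])
    F = random_fourier_features(X, M=200, gamma=1.0, rng=rng)
    G = random_fourier_features(Y, M=200, gamma=1.0, rng=rng)
    corrs, A, B = linear_cca(F, G, k=3)
    print("top canonical correlations:", corrs)
```

The exact solve above forms and factorizes $M\times M$ covariance matrices, so its memory grows as $M^2$ and its time as $M^3$. This is the cost that becomes challenging when $M$ is very large, and it is the step the paper replaces with a stochastic optimization algorithm.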
