Generalizing Correspondence Analysis for Applications in Machine Learning
This work addresses a scalability bottleneck in applying CA to modern machine learning tasks, enabling broader use in fields such as multi-view learning, though the contribution is an incremental extension of existing CA methods.
The paper tackles the scalability limitations of correspondence analysis (CA) for large, high-dimensional datasets by reformulating it as an information-theoretic problem involving principal inertia components, and demonstrates novel algorithms using deep neural networks to achieve unprecedented scale.
Correspondence analysis (CA) is a multivariate statistical tool used to visualize and interpret data dependencies by finding maximally correlated embeddings of pairs of random variables. CA has found applications in fields ranging from epidemiology to social sciences; however, current methods do not scale to large, high-dimensional datasets. In this paper, we provide a novel interpretation of CA in terms of an information-theoretic quantity called the principal inertia components. We show that estimating the principal inertia components, which amounts to solving a functional optimization problem over the space of finite-variance functions of two random variables, is equivalent to performing CA. We then leverage this insight to design novel algorithms to perform CA at an unprecedented scale. In particular, we demonstrate how the principal inertia components can be reliably approximated from data using deep neural networks. Finally, we show that these maximally correlated embeddings of pairs of random variables play a central role in several learning problems, including visualizing classification boundaries and the training process, and underlie recent multi-view and multi-modal learning methods.
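To make the link between CA and the principal inertia components concrete, the following is a minimal sketch of the classical small-table computation, not the neural estimator described in the abstract. It assumes the usual CA convention that the principal inertia components are the squared singular values of the standardized residual matrix of the joint distribution, and that every row and column of the table has a positive total. The function name `principal_inertia_components` and the example tables are hypothetical, chosen only for illustration.

```python
import numpy as np

def principal_inertia_components(counts):
    """Classical CA on a contingency table via SVD (illustrative sketch).

    Returns (inertias, F, G): the squared singular values of the
    standardized residual matrix, and row/column embeddings scaled to
    unit variance under the respective marginals.
    Assumes every row and column total is strictly positive.
    """
    P = np.asarray(counts, dtype=float)
    P = P / P.sum()                 # empirical joint distribution
    px = P.sum(axis=1)              # row marginal
    py = P.sum(axis=0)              # column marginal
    scale = np.sqrt(np.outer(px, py))
    # Standardized residuals: D_x^{-1/2} (P - px py^T) D_y^{-1/2}
    Q = (P - np.outer(px, py)) / scale
    U, s, Vt = np.linalg.svd(Q, full_matrices=False)
    F = U / np.sqrt(px)[:, None]    # embeddings of the row variable
    G = Vt.T / np.sqrt(py)[:, None] # embeddings of the column variable
    return s ** 2, F, G

# Perfectly dependent 2x2 table: the leading component equals 1.
inertias_dep, _, _ = principal_inertia_components([[1, 0], [0, 1]])

# Independent table (outer product of marginals): all components vanish.
inertias_ind, _, _ = principal_inertia_components(np.outer([1, 2], [3, 4]))
```

Under these assumptions, the first columns of `F` and `G` are a maximally correlated pair of embeddings, and their correlation under the joint distribution equals the square root of the leading component. The SVD requires the full table in memory, which is the scaling limitation that motivates estimating the same quantities from samples instead.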