MLJul 1, 2013

Dimensionality Detection and Integration of Multiple Data Sources via the GP-LVM

arXiv:1307.0323v1
Originality Incremental advance
AI Analysis

This work addresses dimensionality reduction and data integration for researchers in machine learning, but it is incremental as it builds on existing GP-LVM methods.

The paper tackled the problem of determining the optimal number of latent variables and integrating multiple data sources using the Gaussian Process Latent Variable Model (GP-LVM), resulting in more robust performance and significant gains in prediction accuracy for binary classification tasks.

The Gaussian Process Latent Variable Model (GP-LVM) is a non-linear probabilistic method of embedding a high dimensional dataset in terms low dimensional `latent' variables. In this paper we illustrate that maximum a posteriori (MAP) estimation of the latent variables and hyperparameters can be used for model selection and hence we can determine the optimal number or latent variables and the most appropriate model. This is an alternative to the variational approaches developed recently and may be useful when we want to use a non-Gaussian prior or kernel functions that don't have automatic relevance determination (ARD) parameters. Using a second order expansion of the latent variable posterior we can marginalise the latent variables and obtain an estimate for the hyperparameter posterior. Secondly, we use the GP-LVM to integrate multiple data sources by simultaneously embedding them in terms of common latent variables. We present results from synthetic data to illustrate the successful detection and retrieval of low dimensional structure from high dimensional data. We demonstrate that the integration of multiple data sources leads to more robust performance. Finally, we show that when the data are used for binary classification tasks we can attain a significant gain in prediction accuracy when the low dimensional representation is used.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes