LGJan 29, 2024

A Bayesian Gaussian Process-Based Latent Discriminative Generative Decoder (LDGD) Model for High-Dimensional Data

arXiv:2401.16497v35 citationsh-index: 39IEEE Access
Originality Incremental advance
AI Analysis

This work addresses a domain-specific problem in machine learning for analyzing high-dimensional data, offering incremental improvements in efficiency and robustness for limited sample scenarios.

The paper tackles the challenge of extracting meaningful information from high-dimensional, noisy data by proposing a non-parametric Gaussian process-based model called LDGD that maps data to a latent manifold, achieving superior label prediction accuracy compared to state-of-the-art methods.

Extracting meaningful information from high-dimensional data poses a formidable modeling challenge, particularly when the data is obscured by noise or represented through different modalities. This research proposes a novel non-parametric modeling approach, leveraging the Gaussian process (GP), to characterize high-dimensional data by mapping it to a latent low-dimensional manifold. This model, named the latent discriminative generative decoder (LDGD), employs both the data and associated labels in the manifold discovery process. We derive a Bayesian solution to infer the latent variables, allowing LDGD to effectively capture inherent stochasticity in the data. We demonstrate applications of LDGD on both synthetic and benchmark datasets. Not only does LDGD infer the manifold accurately, but its accuracy in predicting data points' labels surpasses state-of-the-art approaches. In the development of LDGD, we have incorporated inducing points to reduce the computational complexity of Gaussian processes for large datasets, enabling batch training for enhanced efficient processing and scalability. Additionally, we show that LDGD can robustly infer manifold and precisely predict labels for scenarios in that data size is limited, demonstrating its capability to efficiently characterize high-dimensional data with limited samples. These collective attributes highlight the importance of developing non-parametric modeling approaches to analyze high-dimensional data.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes