ML LG Nov 11, 2024

Unified Bayesian representation for high-dimensional multi-modal biomedical data for small-sample classification

arXiv:2411.07043v1
2 citations
h-index: 31
Originality: Incremental advance
AI Analysis

This work addresses biomarker identification in biomedical research, where data are scarce and multi-modal. The contribution appears incremental, since it builds on existing Bayesian and kernel methods.

The authors tackled the problem of classifying high-dimensional multi-modal biomedical data with small sample sizes by developing BALDUR, a Bayesian algorithm that outperformed state-of-the-art models on two neurodegeneration datasets.

We present BALDUR, a novel Bayesian algorithm designed to deal with multi-modal datasets and small sample sizes in high-dimensional settings while providing explainable solutions. To do so, the proposed model combines within a common latent space the different data views to extract the relevant information to solve the classification task and prune out the irrelevant/redundant features/data views. Furthermore, to provide generalizable solutions in small sample size scenarios, BALDUR efficiently integrates dual kernels over the views with a small sample-to-feature ratio. Finally, its linear nature ensures the explainability of the model outcomes, allowing its use for biomarker identification. This model was tested over two different neurodegeneration datasets, outperforming the state-of-the-art models and detecting features aligned with markers already described in the scientific literature.
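The "dual kernels" idea mentioned in the abstract can be illustrated with a standard identity for linear models. The sketch below is not BALDUR itself: it swaps in plain ridge regression as a stand-in, and the function names, the regularization value `lam`, and the synthetic data are all assumptions made for illustration. It shows that when features greatly outnumber samples (D >> N), the same linear weights can be obtained by solving an N x N system over the Gram matrix instead of a D x D system over the features.

```python
import numpy as np


def ridge_primal(X, y, lam):
    """Ridge regression solved in feature space (a D x D linear system)."""
    D = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(D), X.T @ y)


def ridge_dual(X, y, lam):
    """The same ridge problem solved in sample space (an N x N linear system)."""
    N = X.shape[0]
    K = X @ X.T                                   # linear kernel (Gram matrix), N x N
    a = np.linalg.solve(K + lam * np.eye(N), y)   # one dual coefficient per sample
    return X.T @ a                                # map back to one weight per feature


# Synthetic small-sample, high-dimensional data (illustrative assumption only).
rng = np.random.default_rng(0)
N, D = 20, 500
X = rng.standard_normal((N, D))
y = rng.standard_normal(N)

w_primal = ridge_primal(X, y, lam=1.0)
w_dual = ridge_dual(X, y, lam=1.0)

# Largest elementwise gap between the two weight vectors.
max_abs_diff = float(np.max(np.abs(w_primal - w_dual)))
```

Because the dual solution still maps back to one weight per input feature, a linear model of this kind stays interpretable at the feature level, which is consistent with the abstract's point about explainability and biomarker identification.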

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
