AP · ML · Sep 11, 2021

Microbiome subcommunity learning with logistic-tree normal latent Dirichlet allocation

arXiv:2109.05386v3 · 7 citations
Originality: Incremental advance
AI Analysis

This work addresses a specific limitation in microbiome data analysis: existing LDA-based methods do not model cross-sample heterogeneity. The advance is incremental, since it extends those LDA methods with a heterogeneity model rather than replacing them.

The authors address cross-sample heterogeneity in microbiome subcommunity compositions, which makes existing models such as LDA sensitive to the specified number of subcommunities. They develop a new mixed-membership model that uses logistic-tree normal distributions to account for this variation, yielding more robust inference and meaningful subcommunity identification.

Mixed-membership (MM) models such as Latent Dirichlet Allocation (LDA) have been applied to microbiome compositional data to identify latent subcommunities of microbial species. These subcommunities are informative for understanding the biological interplay of microbes and for predicting health outcomes. However, microbiome compositions typically display substantial cross-sample heterogeneities in subcommunity compositions -- that is, the variability in the proportions of microbes in shared subcommunities across samples -- which is not accounted for in prior analyses. As a result, LDA can produce inference which is highly sensitive to the specification of the number of subcommunities and often divides a single subcommunity into multiple artificial ones. To address this limitation, we incorporate the logistic-tree normal (LTN) model into LDA to form a new MM model. This model allows cross-sample variation in the composition of each subcommunity around some "centroid" composition that defines the subcommunity. Incorporation of auxiliary Pólya-Gamma variables enables a computationally efficient collapsed blocked Gibbs sampler to carry out Bayesian inference under this model. By accounting for such heterogeneity, our new model restores the robustness of the inference in the specification of the number of subcommunities and allows meaningful subcommunities to be identified.
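The generative idea in the abstract can be illustrated with a minimal simulation sketch. It is not the authors' implementation: the complete binary tree in heap order, the independent Gaussian noise (a diagonal covariance with a single `sd`), the normal draw for the centroids, and all function and parameter names (`tree_to_composition`, `simulate_counts`, `alpha`, `sd`) are assumptions made for illustration. Each subcommunity has a "centroid" vector of log-odds at the internal nodes of a tree over taxa; each sample perturbs that centroid, converts the log-odds to a composition over taxa, and then draws reads from a mixture of its subcommunities.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def tree_to_composition(psi):
    """Map log-odds at the internal nodes of a complete binary tree
    (heap order, root at index 0) to a probability vector over the leaves.

    Assumption: the tree is complete, so node i has children 2i+1 and 2i+2,
    and the last len(psi) + 1 nodes are the leaves (taxa).
    """
    psi = np.asarray(psi, dtype=float)
    n_internal = len(psi)
    n_leaves = n_internal + 1
    go_left = sigmoid(psi)  # probability of branching left at each node
    node_prob = np.ones(n_internal + n_leaves)
    for i in range(n_internal):
        node_prob[2 * i + 1] = node_prob[i] * go_left[i]
        node_prob[2 * i + 2] = node_prob[i] * (1.0 - go_left[i])
    return node_prob[n_internal:]


def simulate_counts(n_samples=5, n_subcomm=2, n_taxa=4, reads=100,
                    alpha=1.0, sd=0.5, seed=0):
    """Simulate a samples-by-taxa count matrix (illustrative settings only)."""
    rng = np.random.default_rng(seed)
    n_internal = n_taxa - 1
    # Centroid log-odds for each subcommunity (assumed normal draw).
    mu = rng.normal(0.0, 2.0, size=(n_subcomm, n_internal))
    counts = np.zeros((n_samples, n_taxa), dtype=int)
    for d in range(n_samples):
        # Sample-specific subcommunity proportions, as in LDA.
        phi = rng.dirichlet(alpha * np.ones(n_subcomm))
        # Sample-specific log-odds vary around each centroid
        # (independent noise is a simplifying assumption).
        psi = mu + rng.normal(0.0, sd, size=mu.shape)
        beta = np.array([tree_to_composition(p) for p in psi])
        # Assign each read to a subcommunity, then to a taxon.
        reads_per_subcomm = rng.multinomial(reads, phi)
        for k in range(n_subcomm):
            counts[d] += rng.multinomial(reads_per_subcomm[k], beta[k])
    return counts, mu


counts, mu = simulate_counts()
```

Setting `sd=0` in this sketch removes the cross-sample variation, so every sample shares the same subcommunity compositions, which corresponds to the plain LDA assumption that the abstract describes as too restrictive.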

Code Implementations: 1 repo
