Phenotype Inference with Semi-Supervised Mixed Membership Models
This work addresses the challenge of reducing expert labeling effort in healthcare data analysis, though it appears incremental because it builds on existing mixed membership models.
The authors tackle disease phenotyping from clinical data by proposing a semi-supervised mixed membership model (SS3M) that learns interpretable, disease-specific phenotypes from relatively few labels. The learned phenotypes capture the clinical characteristics of the diseases specified by those labels.
Disease phenotyping algorithms process observational clinical data to identify patients with specific diseases. Supervised phenotyping methods require significant quantities of expert-labeled data, while unsupervised methods may learn non-disease phenotypes. To address these limitations, we propose the Semi-Supervised Mixed Membership Model (SS3M) -- a probabilistic graphical model for learning disease phenotypes from clinical data with relatively few labels. We show SS3M can learn interpretable, disease-specific phenotypes which capture the clinical characteristics of the diseases specified by the labels provided.