APLG Nov 7, 2020

Bayesian Nonparametric Dimensionality Reduction of Categorical Data for Predicting Severity of COVID-19 in Pregnant Women

arXiv:2011.03715v2
AI Analysis

This work addresses a specific clinical prediction problem for pregnant women with COVID-19, a group underrepresented in clinical research, but it is incremental: it applies an existing Bayesian method to a new dataset.

The authors predict COVID-19 severity in pregnant women by modeling multivariate categorical clinical data with Bayesian nonparametric dimensionality reduction based on latent Gaussian processes, and report better prediction accuracy than with dummy (one-hot) encoding.

The coronavirus disease (COVID-19) has rapidly spread throughout the world, and while pregnant women present the same adverse outcome rates, they are underrepresented in clinical research. We collected clinical data of 155 pregnant women who tested positive for COVID-19 at Stony Brook University Hospital. Many of the collected data are of multivariate categorical type, where the number of possible outcomes grows exponentially as the dimension of the data increases. We modeled the data within an unsupervised Bayesian framework and mapped them into a lower-dimensional space using latent Gaussian processes. The latent features in the lower-dimensional space were then used to predict whether a pregnant woman would be admitted to a hospital due to COVID-19 or would remain with mild symptoms. We compared the prediction accuracy with that of dummy/one-hot encoding of the categorical data and found that the latent Gaussian process had better accuracy.
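
To make the abstract's point about exponential growth concrete, the following minimal sketch contrasts the number of joint outcomes of a set of categorical variables with the width of their dummy/one-hot encoding. The variable counts, category levels, and function names are hypothetical and chosen only for illustration; they are not taken from the paper's dataset or code.

```python
def joint_outcomes(cardinalities):
    """Count the distinct joint configurations of the categorical variables."""
    total = 1
    for k in cardinalities:
        total *= k
    return total


def one_hot(record, cardinalities):
    """Dummy/one-hot encode one record given as integer category indices."""
    encoded = []
    for value, k in zip(record, cardinalities):
        block = [0] * k      # one column per category level
        block[value] = 1     # mark the observed level
        encoded.extend(block)
    return encoded


# Hypothetical setting: 10 clinical variables with 3 levels each.
cardinalities = [3] * 10
record = [0, 2, 1, 0, 0, 1, 2, 2, 0, 1]

n_joint = joint_outcomes(cardinalities)  # grows as 3**D with D variables
n_columns = sum(cardinalities)           # grows only linearly in D
encoded = one_hot(record, cardinalities)

print(n_joint, n_columns, sum(encoded))  # prints: 59049 30 10
```

With only 155 patients, almost all of the 59049 joint configurations in this toy setting would never be observed. That sparsity is the motivation the abstract gives for mapping the data into a lower-dimensional latent space rather than working with the sparse one-hot vectors directly.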

