Simulation of virtual cohorts increases predictive accuracy of cognitive decline in MCI subjects
This work addresses the challenge of limited longitudinal data for predicting biomarker progression in neurodegenerative diseases like Alzheimer's, offering a domain-specific solution that is incremental in nature.
The researchers tackled the problem of predicting cognitive decline in mild cognitive impairment (MCI) subjects by developing a data augmentation technique to simulate virtual cohorts, which increased the size and variability of longitudinal training data. They demonstrated a 37% improvement in mean absolute error for predicting MMSE scores, achieving errors comparable to data noise levels.
The ability to predict the progression of biomarkers, notably in neurodegenerative diseases (NDD), is limited by the size of longitudinal data sets in terms of the number of patients, the number of visits per patient, and the total follow-up time. To address this, we introduce a data augmentation technique that reproduces the variability seen in a longitudinal training data set and simulates continuous biomarker trajectories for any number of virtual patients. Using this simulation framework, we propose to transform the training set into a simulated data set with more patients, more time points per patient, and a longer follow-up duration. We illustrate this approach on the prediction of the MMSE of MCI subjects in the ADNI data set. We show that it yields predictions with errors comparable to the noise in the data, as estimated in test/retest studies, achieving a 37% improvement in mean absolute error over the same model trained without augmentation.
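To make the idea of a virtual cohort concrete, the following is a minimal sketch and not the authors' simulation framework. It assumes, purely for illustration, that each patient's MMSE follows a straight line over time, that intercepts and slopes are independently Gaussian across patients, and that visit noise is Gaussian. The function names `fit_line` and `simulate_virtual_cohort`, the parameter `noise_sd`, and the toy input values are all hypothetical. The only domain fact used is that MMSE scores lie between 0 and 30.

```python
import random
import statistics


def fit_line(times, scores):
    """Least-squares fit of score = intercept + slope * time for one patient."""
    t_mean = statistics.fmean(times)
    s_mean = statistics.fmean(scores)
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (s - s_mean) for t, s in zip(times, scores))
    slope = sxy / sxx
    return s_mean - slope * t_mean, slope


def simulate_virtual_cohort(real_patients, n_virtual, n_visits,
                            follow_up_years, noise_sd, seed=0):
    """Sample virtual patients with more visits and longer follow-up.

    Assumption (illustrative only): linear trajectories, with intercepts
    and slopes drawn independently from Gaussians fitted to the real
    cohort. Requires at least two real patients and n_visits >= 2.
    """
    rng = random.Random(seed)
    fits = [fit_line(times, scores) for times, scores in real_patients]
    intercepts = [a for a, _ in fits]
    slopes = [b for _, b in fits]
    mu_a, sd_a = statistics.fmean(intercepts), statistics.stdev(intercepts)
    mu_b, sd_b = statistics.fmean(slopes), statistics.stdev(slopes)

    # Regular visit grid covering the full (extended) follow-up period.
    grid = [follow_up_years * k / (n_visits - 1) for k in range(n_visits)]

    cohort = []
    for _ in range(n_virtual):
        a = rng.gauss(mu_a, sd_a)
        b = rng.gauss(mu_b, sd_b)
        # Add visit-level noise and clip to the valid MMSE range [0, 30].
        scores = [min(30.0, max(0.0, a + b * t + rng.gauss(0.0, noise_sd)))
                  for t in grid]
        cohort.append((grid, scores))
    return cohort


# Toy input, made up for illustration (not ADNI data):
# each patient is (visit times in years, MMSE scores).
real_patients = [
    ([0.0, 1.0, 2.0], [28.0, 27.0, 25.0]),
    ([0.0, 0.5, 1.5], [26.0, 26.0, 24.0]),
    ([0.0, 1.0, 3.0], [29.0, 28.0, 27.0]),
]

virtual = simulate_virtual_cohort(real_patients, n_virtual=100, n_visits=10,
                                  follow_up_years=5.0, noise_sd=1.0)
print(len(virtual), "virtual patients,", len(virtual[0][0]), "visits each")
```

In this sketch, three real patients with three visits each over at most three years become 100 virtual patients with ten visits each over five years. A realistic generative model would also need to capture nonlinear progression and the correlation between individual parameters, which this sketch deliberately ignores.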