Longitudinal prediction of DNA methylation to forecast epigenetic outcomes
This work addresses longitudinal epigenetic profiling for researchers studying development and disease. Its contribution is incremental, since it applies an existing method to a new biological context.
The researchers tackled the challenge of predicting DNA methylation changes over time in children by developing a probabilistic longitudinal machine learning framework based on multi-mean Gaussian processes. Trained on data from ages 0-4, the framework accurately forecast methylation status at ages 5-7.
Interrogating the evolution of biological changes at early stages of life requires longitudinal profiling of molecular markers, such as DNA methylation, which can be challenging to obtain from children. We introduce a probabilistic, longitudinal machine learning framework based on multi-mean Gaussian processes (GPs) that accounts for correlations across individuals and genes over time. The method predicts the future DNA methylation status of each individual at different ages while quantifying the uncertainty of those predictions. We train the model on a birth cohort of children with methylation profiled at ages 0-4 and demonstrate that the status of methylation sites for each child can be accurately predicted at ages 5-7. We show that methylation profiles predicted by multi-mean GPs can be used to estimate other phenotypes, such as epigenetic age, and enable comparison with other health measures of interest. This approach encourages epigenetic studies to move towards longitudinal designs for investigating epigenetic changes during development, ageing and disease progression.
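To make the forecasting idea concrete, the following is a minimal sketch of GP regression for a single child and a single methylation site, trained on ages 0-4 and extrapolated to ages 5-7 with an uncertainty estimate. It is not the multi-mean GP described above: it assumes a constant per-child mean and a squared-exponential kernel, and it does not model correlations across individuals or genes. The function names (`rbf_kernel`, `gp_forecast`), the hyperparameter values, and the methylation values are all hypothetical and chosen only for illustration.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=3.0, variance=0.01):
    """Squared-exponential covariance between two 1-D arrays of ages."""
    diff = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (diff / length_scale) ** 2)


def gp_forecast(train_ages, train_values, test_ages, noise=1e-4):
    """Return the GP posterior mean and standard deviation at test_ages."""
    # Assumption: a constant mean equal to this child's average training value.
    mean = train_values.mean()

    K = rbf_kernel(train_ages, train_ages) + noise * np.eye(len(train_ages))
    K_s = rbf_kernel(test_ages, train_ages)
    K_ss = rbf_kernel(test_ages, test_ages)

    # Cholesky factorisation for a numerically stable solve of K^{-1} (y - mean).
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, train_values - mean))
    post_mean = mean + K_s @ alpha

    # Posterior covariance: K_ss - K_s K^{-1} K_s^T.
    v = np.linalg.solve(L, K_s.T)
    post_cov = K_ss - v.T @ v
    post_std = np.sqrt(np.clip(np.diag(post_cov), 0.0, None))
    return post_mean, post_std


# Synthetic methylation (beta) values for one child at one site; not real data.
train_ages = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
train_beta = np.array([0.40, 0.43, 0.47, 0.49, 0.52])
test_ages = np.array([5.0, 6.0, 7.0])

pred_mean, pred_std = gp_forecast(train_ages, train_beta, test_ages)
for age, m, s in zip(test_ages, pred_mean, pred_std):
    print(f"age {age:.0f}: predicted beta = {m:.3f} +/- {s:.3f}")
```

In this sketch the predictive standard deviation grows as the forecast age moves further from the last observed age, which is the kind of uncertainty a probabilistic forecast is meant to expose.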