Nonlinear network-based quantitative trait prediction from transcriptomic data
This work addresses the problem of quantitative trait prediction in molecular biology, offering a parametric model that improves interpretability for researchers, though it is incremental in its methodological contributions.
The authors tackled the challenge of predicting quantitative traits from transcriptomic data by developing a novel approach that accounts for sample heterogeneity and hidden gene regulatory networks, demonstrating competitive predictive performance and interpretability on both simulated data and real Drosophila data for alcohol sensitivity.
Quantitatively predicting phenotype variables from the expression changes in a set of candidate genes is of great interest in molecular biology, but it is also a challenging task for several reasons. First, the collected biological observations might be heterogeneous and correspond to different biological mechanisms. Second, the gene expression variables used to predict the phenotype are potentially highly correlated, since genes interact through unknown regulatory networks. In this paper, we present a novel approach designed to predict quantitative traits from transcriptomic data, taking into account the heterogeneity in biological samples and the hidden gene regulatory networks underlying different biological mechanisms. The proposed model performs well on prediction and is also fully parametric, which facilitates the downstream biological interpretation. The model provides clusters of individuals based on the relation between gene expression data and the phenotype, and also allows a gene regulatory network specific to each cluster of individuals to be inferred. We perform numerical simulations to demonstrate that our model is competitive with other prediction models, and we demonstrate its predictive performance and interpretability by predicting alcohol sensitivity from real transcriptomic data from the Drosophila melanogaster Genetic Reference Panel (DGRP).
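To make the idea of clustering individuals by the relation between expression and phenotype concrete, the following is a minimal sketch, not the model proposed in the paper. It fits a two-component mixture of simple linear regressions by EM, with a single simulated "gene" standing in for the transcriptome. The function names, the simulated coefficients, and the heuristic initialisation are all assumptions made for illustration, and the cluster-specific network inference described in the abstract is omitted.

```python
import math
import random


def simulate(n=300, seed=0):
    """Toy data (hypothetical): two hidden groups with different expression-trait lines."""
    rng = random.Random(seed)
    true_lines = [(1.0, 2.0), (-1.0, -1.5)]  # (intercept, slope) per group, assumed values
    x, y, labels = [], [], []
    for _ in range(n):
        k = rng.randrange(2)
        xi = rng.uniform(-3.0, 3.0)
        a, b = true_lines[k]
        x.append(xi)
        y.append(a + b * xi + rng.gauss(0.0, 0.3))
        labels.append(k)
    return x, y, labels


def weighted_line_fit(x, y, w):
    """Weighted least squares for y = a + b*x; returns (a, b, residual variance)."""
    total = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / total
    my = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    a = my - b * mx
    var = sum(wi * (yi - a - b * xi) ** 2 for wi, xi, yi in zip(w, x, y)) / total
    return a, b, max(var, 1e-6)  # floor the variance to avoid degenerate components


def log_normal_pdf(resid, var):
    """Log density of a zero-mean Gaussian with variance `var` at `resid`."""
    return -0.5 * (math.log(2 * math.pi * var) + resid ** 2 / var)


def fit_mixture_of_regressions(x, y, n_iter=100):
    """EM for a two-component mixture of simple linear regressions."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sd_x = math.sqrt(sum((xi - mx) ** 2 for xi in x) / n)
    sd_y = math.sqrt(sum((yi - my) ** 2 for yi in y) / n)
    slope0 = sd_y / sd_x
    # Heuristic start (an assumption): two lines through the data centre, opposite slopes.
    params = [(my - slope0 * mx, slope0, sd_y ** 2),
              (my + slope0 * mx, -slope0, sd_y ** 2)]
    weights = [0.5, 0.5]
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability that each sample belongs to each component.
        resp = []
        for xi, yi in zip(x, y):
            logp = [math.log(pk) + log_normal_pdf(yi - a - b * xi, var)
                    for pk, (a, b, var) in zip(weights, params)]
            top = max(logp)
            unnorm = [math.exp(lp - top) for lp in logp]
            z = sum(unnorm)
            resp.append([u / z for u in unnorm])
        # M-step: refit each line with the responsibilities as weights.
        params = [weighted_line_fit(x, y, [r[k] for r in resp]) for k in range(2)]
        weights = [max(sum(r[k] for r in resp) / n, 1e-12) for k in range(2)]
    return params, weights, resp


if __name__ == "__main__":
    x, y, labels = simulate()
    params, weights, resp = fit_mixture_of_regressions(x, y)
    for k, (a, b, var) in enumerate(params):
        print(f"component {k}: intercept={a:.2f} slope={b:.2f} "
              f"var={var:.3f} weight={weights[k]:.2f}")
```

Each fitted component is an ordinary linear regression, so its coefficients can be read directly, which is the kind of interpretability a fully parametric model offers. Extending the sketch to many correlated genes would require a multivariate component model with a structured covariance, which is beyond this illustration.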