GraphGDel: Constructing and Learning Graph Representations of Genome-Scale Metabolic Models for Growth-Coupled Gene Deletion Prediction
This work addresses a domain-specific problem in computational biology, relevant to researchers optimizing metabolic engineering strategies. Its contribution is incremental, since it builds on existing graph-learning and sequence-learning methods.
The paper tackles the problem of predicting growth-coupled gene deletion strategies in genome-scale metabolic models. It introduces a systematic pipeline for constructing graph representations and a deep learning framework that integrates these graphs with sequence data, and it reports overall accuracy improvements of up to 16.26% over a deep feedforward neural network baseline.
In genome-scale constraint-based metabolic models, gene deletion strategies are essential for achieving growth-coupled production, where cell growth and target metabolite synthesis occur simultaneously. Genome-scale metabolic models are inherently networked, yet existing computational approaches rely primarily on sequential data and lack graph representations that capture the models' complex relationships: both well-defined graph constructions and learning frameworks capable of exploiting them remain largely unexplored. To address this gap, we present a twofold solution. First, we introduce a systematic pipeline for constructing graph representations from constraint-based metabolic models. Second, we develop a deep learning framework that integrates these graph representations with gene and metabolite sequence data to predict growth-coupled gene deletion strategies. Across three metabolic models, our approach consistently outperforms established baselines, with improvements in overall accuracy of 14.04%, 16.26%, and 13.18% over a deep feedforward neural network baseline, 6.17%, 4.96%, and 5.31% over a sequence-learning baseline, and 5.10%, 4.36%, and 4.70% over a topology-aware graph aggregation baseline on the same metabolite graph, respectively. The source code and example datasets are available at: https://github.com/MetNetComp/GraphGDel.
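To make the idea of a graph representation of a constraint-based model concrete, the following is a minimal sketch and not the GraphGDel pipeline itself. It assumes one common convention: a directed metabolite graph in which a substrate is linked to every product of the same reaction, read off the signs of a stoichiometric matrix. The toy matrix `S`, the metabolite and reaction names, and the helper `metabolite_graph` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical toy model (not from the paper).
# Rows are metabolites, columns are reactions.
# Negative entries mark consumed metabolites, positive entries produced ones.
metabolites = ["A", "B", "C", "D"]
reactions = ["R1", "R2", "R3"]
S = [
    [-1,  0,  0],   # A is consumed by R1
    [ 1, -1, -1],   # B is produced by R1, consumed by R2 and R3
    [ 0,  1,  0],   # C is produced by R2
    [ 0,  0,  1],   # D is produced by R3
]

def metabolite_graph(S, metabolites):
    """Return a directed adjacency map: substrate -> sorted list of products.

    Assumed convention: for each reaction, every substrate gets an edge
    to every product of that reaction.
    """
    edges = defaultdict(set)
    n_reactions = len(S[0])
    for j in range(n_reactions):
        substrates = [metabolites[i] for i, row in enumerate(S) if row[j] < 0]
        products = [metabolites[i] for i, row in enumerate(S) if row[j] > 0]
        for s in substrates:
            for p in products:
                edges[s].add(p)
    # Include metabolites with no outgoing edges so every node is present.
    return {m: sorted(edges[m]) for m in metabolites}

graph = metabolite_graph(S, metabolites)
for node, successors in graph.items():
    print(node, "->", successors)
```

In this toy case, `B` links to both `C` and `D` because it is a substrate of two reactions. Other constructions (for example, bipartite metabolite-reaction graphs or gene-level graphs) are equally plausible, and the abstract does not specify which one the authors use.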