QM · LG · Jul 11, 2018

Estimating Cellular Goals from High-Dimensional Biological Data

arXiv:1807.04245v4 · 5 citations
Originality: Incremental advance
AI Analysis

This work addresses a bottleneck in modeling cell metabolism for applications like antimicrobial development and cancer drug discovery, representing an incremental improvement in scalability over existing methods.

The paper tackles the problem of estimating cellular goals from high-dimensional biological data by developing a scalable approach to learning constraint reactions in metabolic models. The approach scales to realistically large models, is evaluated on 75 large-scale metabolic network models for different organisms, and recovers constraint reactions even when some measurements are missing.

Optimization-based models have been used to predict cellular behavior for over 25 years. The constraints in these models are derived from genome annotations, the measured macromolecular composition of cells, and measurements of the cell's growth rate and metabolism in different conditions. The cellular goal (the optimization problem that the cell is trying to solve) can be challenging to derive experimentally for many organisms, including human or mammalian cells, which have complex metabolic capabilities and are not well understood. Existing approaches to learning goals from data include (a) estimating a linear objective function, or (b) estimating linear constraints that model complex biochemical reactions and constrain the cell's operation. The latter approach is important because the known or observed biochemical reactions are often not enough to explain observations, and hence there is a need to automatically extend the model's complexity by learning new chemical reactions. However, this leads to nonconvex optimization problems, and existing tools cannot scale to realistically large metabolic models. Hence, constraint estimation is still used sparingly despite its benefits for modeling cell metabolism, which is important for developing novel antimicrobials against pathogens, discovering cancer drug targets, and producing value-added chemicals. Here, we develop the first approach to estimating constraint reactions from data that can scale to realistically large metabolic models. Previous tools have been used on problems having fewer than 75 biochemical reactions and 60 metabolites, which limits real-life-size applications. We perform extensive experiments using 75 large-scale metabolic network models for different organisms (including bacteria, yeasts, and mammals) and show that our algorithm can recover cellular constraint reactions, even when some measurements are missing.
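To make the setup in the abstract concrete, the sketch below poses a tiny optimization-based model as a linear program: maximize a goal flux subject to steady-state mass balance and flux bounds, then add one extra linear constraint and observe how the prediction changes. The network, the bounds, the goal reaction, and the 0.2 coupling coefficient are all hypothetical and chosen only for illustration. This shows the forward prediction problem only, not the paper's estimation algorithm, which has to infer such constraints from data.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (not from the paper): 2 metabolites, 4 reactions.
#   v0: uptake -> A      v1: A -> B
#   v2: B -> goal sink   v3: A -> byproduct
S = np.array([
    [1.0, -1.0,  0.0, -1.0],   # mass balance for metabolite A
    [0.0,  1.0, -1.0,  0.0],   # mass balance for metabolite B
])
bounds = [(0, 10), (0, 1000), (0, 1000), (0, 1000)]  # uptake capped at 10

# Assumed cellular goal: maximize v2. linprog minimizes, so negate it.
c = np.array([0.0, 0.0, -1.0, 0.0])


def solve(A_ub=None, b_ub=None):
    """Solve max v2 subject to S v = 0, the bounds, and optional A_ub v <= b_ub."""
    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  A_eq=S, b_eq=np.zeros(S.shape[0]),
                  bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x


# Prediction using only the known reactions.
base = solve()

# Prediction with one extra linear constraint (assumed): v3 >= 0.2 * v1,
# written in linprog's form as 0.2*v1 - v3 <= 0.
coupled = solve(A_ub=np.array([[0.0, 0.2, 0.0, -1.0]]), b_ub=np.array([0.0]))

print("goal flux without extra constraint:", round(base[2], 3))
print("goal flux with extra constraint:   ", round(coupled[2], 3))
```

In this toy case the unconstrained model routes all 10 units of uptake to the goal flux, while the extra constraint diverts part of it to the byproduct and lowers the goal flux to about 8.33. Learning which constraint explains observed fluxes is the inverse of this calculation, and that inverse problem is the nonconvex one the abstract refers to.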

Code Implementations: 1 repo