Interpretable Biomanufacturing Process Risk and Sensitivity Analyses for Quality-by-Design and Stability Control
This work addresses quality-by-design and stability control in biomanufacturing, particularly for personalized therapies, though its contribution is incremental because it builds on existing Bayesian and game-theoretic methods.
The authors tackled the challenge of high variability and limited data in biomanufacturing by developing an interpretable probabilistic knowledge graph and Shapley value-based sensitivity analysis to quantify contributions of process parameters, enabling identification of bottlenecks and improved stability control.
While biomanufacturing plays a significant role in supporting the economy and ensuring public health, it faces critical challenges, including complexity, high variability, lengthy lead times, and very limited process data, especially for new personalized cell and gene biotherapeutics. Driven by these challenges, we propose an interpretable semantic bioprocess probabilistic knowledge graph and develop game-theory-based risk and sensitivity analyses for the production process to facilitate quality-by-design and stability control. Specifically, by exploring the causal relationships and interactions of critical process parameters and critical quality attributes (CPPs/CQAs), we create a Bayesian-network-based probabilistic knowledge graph characterizing the complex causal interdependencies of all factors. Then, we introduce a Shapley-value-based sensitivity analysis, which can correctly quantify the contribution of each input factor to the variation of the outputs (i.e., productivity and product quality). Since the bioprocess model coefficients are learned from limited process observations, we derive the Bayesian posterior distribution to quantify model uncertainty and extend the Shapley-value-based sensitivity analysis to evaluate the impact of estimation uncertainty from each set of model coefficients. Therefore, the proposed bioprocess risk and sensitivity analyses can identify bottlenecks, guide reliable process specifications and the most "informative" data collection, and improve production stability.
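
To make the idea of a Shapley-value-based variance decomposition concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical toy model Y = b1*X1 + b2*X2 + b12*X1*X2 + e with independent, zero-mean inputs; the names COEF, B12, VAR, explained_variance, and shapley_values, and all numeric values, are illustrative assumptions. The value of a coalition S of inputs is taken to be Var(E[Y | inputs in S]), and Shapley values are computed exactly by enumerating all coalitions.

```python
from itertools import combinations
from math import factorial

# Hypothetical coefficients and input variances (illustrative only).
COEF = {"X1": 0.8, "X2": -0.5}            # main effects b1, b2
B12 = 0.4                                 # interaction coefficient b12
VAR = {"X1": 1.0, "X2": 2.0, "e": 0.25}   # variances of independent inputs


def explained_variance(s):
    """v(S) = Var(E[Y | inputs in S]) for the toy model above.

    With independent zero-mean inputs, each main effect contributes
    b_i^2 * Var(X_i), the noise contributes Var(e), and the interaction
    contributes b12^2 * Var(X1) * Var(X2) only when both X1 and X2 are known.
    """
    v = sum(COEF[k] ** 2 * VAR[k] for k in s if k in COEF)
    if "e" in s:
        v += VAR["e"]
    if "X1" in s and "X2" in s:
        v += B12 ** 2 * VAR["X1"] * VAR["X2"]
    return v


def shapley_values(players, value):
    """Exact Shapley values by enumerating every coalition of the other players."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that player p joins.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi


phi = shapley_values(list(VAR), explained_variance)
total_variance = explained_variance(frozenset(VAR))
for name, contribution in phi.items():
    print(f"{name}: {contribution:.3f} ({contribution / total_variance:.1%} of Var(Y))")
```

In this sketch the contributions sum to the total output variance, and the interaction variance is split equally between X1 and X2. Exact enumeration scales exponentially with the number of inputs, so a realistic process model with many CPPs/CQAs would likely need a sampling-based approximation.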