Multimodal Regression for Enzyme Turnover Rates Prediction
This work addresses a critical bottleneck in enzyme kinetics for researchers in biotechnology and industrial biocatalysis, although the contribution is incremental because it builds on existing deep learning techniques.
The authors tackle the problem of predicting enzyme turnover rates, for which measurements are scarce because experiments are costly, by developing a multimodal framework that integrates enzyme sequences, substrate structures, and environmental factors; they report that it outperforms existing methods.
The enzyme turnover rate is a fundamental parameter in enzyme kinetics, reflecting the catalytic efficiency of enzymes. However, measured turnover rates remain scarce for most organisms because experimental measurements are costly and complex. To address this gap, we propose a multimodal framework that predicts the enzyme turnover rate by integrating enzyme sequences, substrate structures, and environmental factors. Our model combines a pre-trained language model and a convolutional neural network to extract features from protein sequences, while a graph neural network captures informative representations of substrate molecules. An attention mechanism is incorporated to model interactions between enzyme and substrate representations. Furthermore, we apply symbolic regression via Kolmogorov-Arnold Networks to explicitly learn mathematical formulas that govern the enzyme turnover rate, enabling interpretable and accurate predictions. Extensive experiments demonstrate that our framework outperforms both traditional and state-of-the-art deep learning approaches. This work provides a robust tool for studying enzyme kinetics and holds promise for applications in enzyme engineering, biotechnology, and industrial biocatalysis.
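To make the fusion step concrete, the following is a minimal sketch of how enzyme and substrate representations might interact through attention before being combined with environmental factors. It assumes scaled dot-product cross-attention and mean pooling, neither of which is specified in the abstract. All function names, the toy embeddings, and the choice of pH and temperature as environmental factors are hypothetical and serve only as illustration, not as the model described here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: every query row attends over all key rows.

    queries: list of d-dim vectors (here, toy enzyme residue embeddings)
    keys, values: lists of d-dim vectors (here, toy substrate atom embeddings)
    Returns (attended vectors, attention weights).
    """
    d = len(queries[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        # Weighted sum of the value vectors
        out = [sum(wj * v[c] for wj, v in zip(w, values)) for c in range(len(values[0]))]
        outputs.append(out)
        weights.append(w)
    return outputs, weights


def mean_pool(vectors):
    """Average a list of equal-length vectors into one vector."""
    n = len(vectors)
    return [sum(v[c] for v in vectors) / n for c in range(len(vectors[0]))]


# Hypothetical toy inputs: 3 residue embeddings and 2 atom embeddings, d = 4.
# In a real pipeline these would come from the sequence and graph encoders.
enzyme = [[1.0, 0.0, 0.5, 0.0], [0.0, 1.0, 0.0, 0.5], [0.5, 0.5, 0.5, 0.5]]
substrate = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
environment = [7.5, 37.0]  # assumed features: pH and temperature in deg C

# Enzyme residues query the substrate atoms
attended, attn = cross_attention(enzyme, substrate, substrate)

# Concatenate pooled enzyme, pooled substrate-aware, and environment features
fused = mean_pool(enzyme) + mean_pool(attended) + environment
print(len(fused))
```

The fused vector would then be passed to a regression head. In the framework described above that role is played by a Kolmogorov-Arnold Network, which this sketch does not attempt to reproduce.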