Active learning for energy-based antibody optimization and enhanced screening
This work addresses the problem of inefficient large-scale mutant screening for antibody optimization, offering a domain-specific incremental improvement by combining machine learning and physics-based methods.
The paper tackles the challenge of predicting and optimizing protein-protein binding affinity for therapeutic antibody development. It proposes an active learning workflow that trains a deep learning model to learn energy functions for specific targets. In a case study targeting HER2-binding Trastuzumab mutants, the workflow significantly improved screening performance over random selection.
Accurate prediction and optimization of protein-protein binding affinity is crucial for therapeutic antibody development. Although machine learning-based $ΔΔG$ prediction methods are suitable for large-scale mutant screening, they struggle to predict the effects of multiple mutations for targets without existing binders. Energy function-based methods, though more accurate, are time-consuming and not ideal for large-scale screening. To address this, we propose an active learning workflow that efficiently trains a deep learning model to learn energy functions for specific targets, combining the advantages of both approaches. Our method integrates the RDE-Network deep learning model with Rosetta's energy function-based Flex ddG to explore mutants efficiently. In a case study targeting HER2-binding Trastuzumab mutants, our approach significantly improved screening performance over random selection and identified mutants with better binding properties without experimental $ΔΔG$ data. This workflow advances computational antibody design by combining machine learning, physics-based computations, and active learning to achieve more efficient antibody development.
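
The loop described above (label a few mutants with an expensive energy calculation, fit a fast surrogate, use the surrogate to choose the next mutants to label) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy oracle stands in for an expensive energy calculation and is not Flex ddG, the additive surrogate stands in for a learned model and is not RDE-Network, and the greedy acquisition rule, pool of double mutants, batch size, and number of rounds are hypothetical choices not taken from the abstract. The sketch assumes the usual convention that a lower (more negative) $ΔΔG$ indicates stronger binding.

```python
import random
from itertools import combinations, product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
POSITIONS = [0, 1, 2, 3]  # hypothetical mutable positions, for illustration only


def make_toy_oracle(seed=0):
    """Toy stand-in for an expensive energy calculation (NOT Flex ddG)."""
    rng = random.Random(seed)
    single = {(p, a): rng.gauss(0.0, 1.0) for p in POSITIONS for a in AMINO_ACIDS}

    def oracle(mutant):
        # Additive per-substitution terms plus a small pairwise coupling.
        score = sum(single[s] for s in mutant)
        score += 0.1 * sum(single[a] * single[b] for a, b in combinations(mutant, 2))
        return score

    return oracle


def fit_additive(labeled):
    """Toy surrogate (NOT RDE-Network): mean per-substitution share of labeled scores."""
    totals, counts = {}, {}
    for mutant, y in labeled.items():
        for s in mutant:
            totals[s] = totals.get(s, 0.0) + y / len(mutant)
            counts[s] = counts.get(s, 0) + 1
    return {s: totals[s] / counts[s] for s in totals}


def predict(model, mutant):
    # Substitutions never seen in the labeled set contribute 0.0.
    return sum(model.get(s, 0.0) for s in mutant)


def active_learning(pool, oracle, n_init=20, batch=20, rounds=5, seed=0):
    """Greedy active learning: label the mutants the surrogate ranks lowest."""
    rng = random.Random(seed)
    labeled = {m: oracle(m) for m in rng.sample(pool, n_init)}  # random start
    for _ in range(rounds):
        model = fit_additive(labeled)  # retrain surrogate on all labels so far
        unlabeled = [m for m in pool if m not in labeled]
        unlabeled.sort(key=lambda m: predict(model, m))  # lowest predicted first
        for m in unlabeled[:batch]:
            labeled[m] = oracle(m)  # expensive call, only on the selected batch
    return labeled


# Pool of all double mutants over the hypothetical positions.
pool = [
    ((p1, a1), (p2, a2))
    for p1, p2 in combinations(POSITIONS, 2)
    for a1, a2 in product(AMINO_ACIDS, repeat=2)
]
oracle = make_toy_oracle()
labeled = active_learning(pool, oracle)
best = min(labeled, key=labeled.get)
print(len(labeled), best, round(labeled[best], 3))
```

The point of the design is that the expensive oracle is called only on the initial random sample and on each selected batch (120 of 2,400 candidates with these settings), while the cheap surrogate ranks the whole pool. A purely greedy rule can over-exploit early estimates; a practical workflow might add an exploration term or an uncertainty estimate, which this sketch omits.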