Adaptive Protein Design Protocols and Middleware
This work addresses the computational bottleneck that researchers face in protein design, though the contribution appears incremental: it builds on existing AI/ML methods and improves mainly the supporting infrastructure.
The paper tackles the computational challenge of sampling vast protein sequence and structure spaces by introducing IMPRESS, an adaptive protocol and middleware that couples AI with high-performance computing, resulting in increased consistency and throughput in protein design.
Computational protein design is undergoing a transformation driven by AI/ML. However, the space of potential protein sequences and structures is astronomically vast, even for moderately sized proteins. Hence, achieving convergence between generated and predicted structures demands substantial computational resources for sampling. Integrated Machine-learning for Protein Structures at Scale (IMPRESS) offers methods and advanced computing systems for coupling AI to high-performance computing tasks, making it possible to evaluate the effectiveness of protein designs as they are developed, as well as the models and simulations used to generate data and train models. This paper introduces IMPRESS and demonstrates the development and implementation of an adaptive protein design protocol and its supporting computing infrastructure. This leads to more consistent protein design quality and to higher design throughput due to dynamic resource allocation and asynchronous workload execution.
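The combination of an adaptive protocol with asynchronous workload execution can be illustrated with a minimal Python sketch. It is not the IMPRESS implementation or its API: every name here (evaluate, mutate, adaptive_design) is hypothetical, the quality score is a random placeholder standing in for an expensive structure-prediction task, and a fixed cap on concurrent tasks stands in for real resource allocation. The sketch only shows the control flow: evaluations run concurrently, and each result is used as soon as it arrives to decide what to submit next.

```python
import asyncio
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

async def evaluate(design: str, rng: random.Random) -> float:
    """Hypothetical stand-in for an expensive structure-prediction task."""
    await asyncio.sleep(rng.uniform(0.001, 0.005))  # simulate variable runtime
    return rng.random()  # placeholder quality score in [0, 1)

def mutate(design: str, rng: random.Random) -> str:
    """Hypothetical stand-in for sequence generation: replace one residue."""
    pos = rng.randrange(len(design))
    return design[:pos] + rng.choice(AMINO_ACIDS) + design[pos + 1:]

async def adaptive_design(seeds, max_evals=40, max_concurrent=4, seed=0):
    """Run an evaluation budget over several design lineages.

    Returns (best, evals), where best maps a lineage index to the
    (score, design) pair with the highest score seen so far.
    """
    rng = random.Random(seed)
    queue = list(enumerate(seeds))  # (lineage, design) waiting to run
    meta = {}                       # running task -> (lineage, design)
    pending = set()
    best = {}
    evals = 0

    while queue or pending:
        # Fill free slots without exceeding the concurrency cap or the budget.
        while queue and len(pending) < max_concurrent and evals < max_evals:
            lineage, design = queue.pop(0)
            task = asyncio.create_task(evaluate(design, rng))
            meta[task] = (lineage, design)
            pending.add(task)
            evals += 1
        if not pending:
            break  # budget exhausted and nothing left running

        # React to whichever evaluation finishes first (asynchronous execution).
        done, pending = await asyncio.wait(
            pending, return_when=asyncio.FIRST_COMPLETED
        )
        for task in done:
            lineage, design = meta.pop(task)
            score = task.result()
            if lineage not in best or score > best[lineage][0]:
                best[lineage] = (score, design)
            # Adaptive step: the next candidate derives from the lineage's best.
            queue.append((lineage, mutate(best[lineage][1], rng)))

    return best, evals

if __name__ == "__main__":
    results, used = asyncio.run(adaptive_design(["MKTAYIAK", "GSHMLEDP"]))
    for lineage, (score, design) in sorted(results.items()):
        print(lineage, design, round(score, 3))
```

In this sketch a slow evaluation never blocks the other lineages, which is the intuition behind the throughput claim; a real system would replace the random score with a convergence metric between generated and predicted structures.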