QM, LG, BM
Jun 10, 2021

Adaptive machine learning for protein engineering

arXiv:2106.05466v2, 101 citations
Originality: Synthesis-oriented
AI Analysis

This is an incremental review for protein engineers, summarizing existing methods without introducing new techniques.

The paper reviews how to use sequence-to-function machine learning models to select protein sequences for experimental measurement, addressing the combinatorial complexity of protein sequences in protein engineering.

Machine-learning models that learn from data to predict how protein sequence encodes function are emerging as a useful protein engineering tool. However, when using these models to suggest new protein designs, one must deal with the vast combinatorial complexity of protein sequences. Here, we review how to use a sequence-to-function machine-learning surrogate model to select sequences for experimental measurement. First, we discuss how to select sequences through a single round of machine-learning optimization. Then, we discuss sequential optimization, where the goal is to discover optimized sequences and improve the model across multiple rounds of training, optimization, and experimental measurement.
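The sequential optimization loop described in the abstract (train a surrogate, choose sequences, measure them, retrain) can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the method of any particular paper covered by the review: the four-letter alphabet, the `TARGET` sequence, the `measure` function standing in for an experiment, the additive site-wise surrogate, and the greedy top-`batch` selection rule are all hypothetical choices made for brevity.

```python
import itertools
import random

ALPHABET = "ACDE"   # toy 4-letter alphabet (assumption, not from the paper)
LENGTH = 4          # toy sequence length (assumption)
TARGET = "DEAC"     # hypothetical optimum that defines the toy "assay"


def measure(seq):
    """Stand-in for an experiment: count positions that match TARGET."""
    return sum(a == b for a, b in zip(seq, TARGET))


def fit_surrogate(data):
    """Fit an additive site-wise surrogate.

    Each (position, residue) pair is scored by the mean measured value of
    the sequences that contain it; unseen pairs fall back to the overall mean.
    """
    sums, counts = {}, {}
    for seq, y in data.items():
        for i, aa in enumerate(seq):
            sums[(i, aa)] = sums.get((i, aa), 0.0) + y
            counts[(i, aa)] = counts.get((i, aa), 0) + 1
    overall = sum(data.values()) / len(data)

    def predict(seq):
        return sum(
            sums[(i, aa)] / counts[(i, aa)] if (i, aa) in counts else overall
            for i, aa in enumerate(seq)
        )

    return predict


def sequential_optimization(rounds=4, batch=8, seed=0):
    """Run several rounds of training, selection, and measurement."""
    rng = random.Random(seed)
    pool = ["".join(p) for p in itertools.product(ALPHABET, repeat=LENGTH)]
    # Round 0: measure a random starting batch.
    data = {s: measure(s) for s in rng.sample(pool, batch)}
    best_per_round = [max(data.values())]
    for _ in range(rounds):
        predict = fit_surrogate(data)                      # train
        candidates = [s for s in pool if s not in data]    # unmeasured only
        candidates.sort(key=predict, reverse=True)         # optimize (greedy)
        for s in candidates[:batch]:                       # measure
            data[s] = measure(s)
        best_per_round.append(max(data.values()))
    return data, best_per_round
```

Calling `sequential_optimization()` returns every measured sequence with its value, plus the best value seen after each round. Running only the first pass through the loop corresponds to the single-round setting. Purely greedy selection is the simplest possible rule; acquisition rules that also account for model uncertainty would replace the `sort` step.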

