BM, LG | May 3, 2023

Exploring the Protein Sequence Space with Global Generative Models

arXiv:2305.01941v1 | 8 citations
Originality: Synthesis-oriented
AI Analysis

The chapter provides an overview for researchers in computational biology and protein engineering, but it is incremental: it synthesizes existing literature without presenting new results.

This book chapter reviews the use of global generative models, such as language models, to explore protein sequence space, focusing on designing novel artificial proteins, non-Transformer architectures, and applications in directed evolution.

Recent advancements in specialized large-scale architectures for image and language modeling have profoundly impacted the fields of computer vision and natural language processing (NLP). Language models, such as the recent ChatGPT and GPT4, have demonstrated exceptional capabilities in processing, translating, and generating human languages. These breakthroughs have also been reflected in protein research, leading to the rapid development of numerous new methods with unprecedented performance. Language models in particular have seen widespread use in protein research, where they have been used to embed proteins, generate novel ones, and predict tertiary structures. In this book chapter, we provide an overview of the use of protein generative models, reviewing 1) language models for the design of novel artificial proteins, 2) works that use non-Transformer architectures, and 3) applications in directed evolution approaches.

