LG · Apr 6

General Multimodal Protein Design Enables DNA-Encoding of Chemistry

arXiv:2604.05181 · 99.91 citations · h-index: 58 · Has Code
AI Analysis

This work addresses the challenge of expanding the set of genetically encodable transformations available to biotechnology and synthetic biology. It presents a novel method rather than an incremental improvement.

The paper tackles the problem of designing new enzymes without pre-specifying catalytic residues. The resulting designs are diverse heme enzymes that catalyze new-to-nature carbene-transfer reactions with high activities, exceeding those of engineered enzymes.

Evolution is an extraordinary engine for enzymatic diversity, yet the chemistry it has explored remains a narrow slice of what DNA can encode. Deep generative models can design new proteins that bind ligands, but none have created enzymes without pre-specifying catalytic residues. We introduce DISCO (DIffusion for Sequence-structure CO-design), a multimodal model that co-designs protein sequence and 3D structure around arbitrary biomolecules, as well as inference-time scaling methods that optimize objectives across both modalities. Conditioned solely on reactive intermediates, DISCO designs diverse heme enzymes with novel active-site geometries. These enzymes catalyze new-to-nature carbene-transfer reactions, including alkene cyclopropanation, spirocyclopropanation, B-H, and C(sp$^3$)-H insertions, with high activities exceeding those of engineered enzymes. Random mutagenesis of a selected design further confirmed that enzyme activity can be improved through directed evolution. By providing a scalable route to evolvable enzymes, DISCO broadens the potential scope of genetically encodable transformations. Code is available at https://github.com/DISCO-design/DISCO.

