Samuel D. Stanton

h-index21
2papers

2 Papers

BMJan 30
Disentangling multispecific antibody function with graph neural networks

Joshua Southern, Changpeng Lu, Santrupti Nerli et al.

Multispecific antibodies offer transformative therapeutic potential by engaging multiple epitopes simultaneously, yet their efficacy is an emergent property governed by complex molecular architectures. Rational design is often bottlenecked by the inability to predict how subtle changes in domain topology influence functional outcomes, a challenge exacerbated by the scarcity of comprehensive experimental data. Here, we introduce a computational framework to address part of this gap. First, we present a generative method for creating large-scale, realistic synthetic functional landscapes that capture non-linear interactions where biological activity depends on domain connectivity. Second, we propose a graph neural network architecture that explicitly encodes these topological constraints, distinguishing between format configurations that appear identical to sequence-only models. We demonstrate that this model, trained on synthetic landscapes, recapitulates complex functional properties and, via transfer learning, has the potential to achieve high predictive accuracy on limited biological datasets. We showcase the model's utility by optimizing trade-offs between efficacy and toxicity in trispecific T-cell engagers and retrieving optimal common light chains. This work provides a robust benchmarking environment for disentangling the combinatorial complexity of multispecifics, accelerating the design of next-generation therapeutics.

LGOct 29, 2024
Generalists vs. Specialists: Evaluating LLMs on Highly-Constrained Biophysical Sequence Optimization Tasks

Angelica Chen, Samuel D. Stanton, Frances Ding et al.

Although large language models (LLMs) have shown promise in biomolecule optimization problems, they incur heavy computational costs and struggle to satisfy precise constraints. On the other hand, specialized solvers like LaMBO-2 offer efficiency and fine-grained control but require more domain expertise. Comparing these approaches is challenging due to expensive laboratory validation and inadequate synthetic benchmarks. We address this by introducing Ehrlich functions, a synthetic test suite that captures the geometric structure of biophysical sequence optimization problems. With prompting alone, off-the-shelf LLMs struggle to optimize Ehrlich functions. In response, we propose LLOME (Language Model Optimization with Margin Expectation), a bilevel optimization routine for online black-box optimization. When combined with a novel preference learning loss, we find LLOME can not only learn to solve some Ehrlich functions, but can even perform as well as or better than LaMBO-2 on moderately difficult Ehrlich variants. However, LLMs also exhibit some likelihood-reward miscalibration and struggle without explicit rewards. Our results indicate LLMs can occasionally provide significant benefits, but specialized solvers are still competitive and incur less overhead.