LG · AI · Aug 25, 2025

Amortized Sampling with Transferable Normalizing Flows

arXiv:2508.18175v113 citations · h-index: 15 · Has Code
Originality Highly original
AI Analysis

This work addresses amortized sampling in computational chemistry, enabling faster and more scalable sampling, though it is incremental in that it builds on existing generative models.

The paper tackles efficient equilibrium sampling of molecular conformations by introducing Prose, a transferable normalizing flow that draws zero-shot, uncorrelated proposal samples for arbitrary peptide systems. Prose transfers across sequence length and, when combined with a simple importance sampling-based finetuning procedure, outperforms established methods such as sequential Monte Carlo on unseen tetrapeptides.

Efficient equilibrium sampling of molecular conformations remains a core challenge in computational chemistry and statistical inference. Classical approaches such as molecular dynamics or Markov chain Monte Carlo inherently lack amortization; the computational cost of sampling must be paid in full for each system of interest. The widespread success of generative models has inspired interest in overcoming this limitation through learning sampling algorithms. Despite performing on par with conventional methods when trained on a single system, learned samplers have so far demonstrated limited ability to transfer across systems. We prove that deep learning enables the design of scalable and transferable samplers by introducing Prose, a 280-million-parameter all-atom transferable normalizing flow trained on a corpus of peptide molecular dynamics trajectories up to 8 residues in length. Prose draws zero-shot uncorrelated proposal samples for arbitrary peptide systems, achieving the previously intractable transferability across sequence length, whilst retaining the efficient likelihood evaluation of normalizing flows. Through extensive empirical evaluation we demonstrate the efficacy of Prose as a proposal for a variety of sampling algorithms, finding a simple importance sampling-based finetuning procedure to achieve superior performance to established methods such as sequential Monte Carlo on unseen tetrapeptides. We open-source the Prose codebase, model weights, and training dataset to further stimulate research into amortized sampling methods and finetuning objectives.
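The abstract relies on using a model with tractable likelihood as an importance-sampling proposal. The sketch below illustrates that general idea with self-normalized importance sampling on a toy problem. It is not the Prose code or the paper's finetuning procedure: the 1D double-well `energy`, the Gaussian proposal standing in for a normalizing flow, and all function names are assumptions made for illustration.

```python
import numpy as np

def energy(x):
    # Hypothetical 1D double-well potential standing in for a molecular energy.
    return (x**2 - 1.0) ** 2

def proposal_sample(rng, n, scale=1.5):
    # Stand-in for drawing uncorrelated samples from a trained flow.
    return rng.normal(0.0, scale, size=n)

def proposal_log_prob(x, scale=1.5):
    # Exact log-density of the proposal; a normalizing flow provides this too.
    return -0.5 * (x / scale) ** 2 - np.log(scale * np.sqrt(2.0 * np.pi))

def self_normalized_importance_sampling(rng, n=20000):
    x = proposal_sample(rng, n)
    # Log importance weight: unnormalized Boltzmann log-density minus proposal log-density.
    log_w = -energy(x) - proposal_log_prob(x)
    log_w -= log_w.max()          # subtract the max for numerical stability
    w = np.exp(log_w)
    w /= w.sum()                  # normalize so the weights sum to one
    ess = 1.0 / np.sum(w**2)      # Kish effective sample size
    return x, w, ess

rng = np.random.default_rng(0)
x, w, ess = self_normalized_importance_sampling(rng)
mean_x = np.sum(w * x)            # weighted estimate of E[x] under the target
mean_x2 = np.sum(w * x**2)        # weighted estimate of E[x^2] under the target
print(f"ESS fraction: {ess / len(x):.3f}")
```

The effective sample size indicates how well the proposal matches the target: the closer it is to the number of drawn samples, the less variance the reweighting introduces.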

