D-Flow SGLD: Source-Space Posterior Sampling for Scientific Inverse Problems with Flow Matching
This work addresses uncertainty-aware posterior sampling for scientific inverse problems and offers a practical method for researchers in fields such as fluid dynamics and data assimilation. The contribution is incremental: it augments existing source-space inference for flow matching priors with preconditioned stochastic gradient Langevin dynamics.
The paper tackles the reconstruction of high-dimensional physical states from sparse, noisy observations in scientific inverse problems. It proposes D-Flow SGLD, a source-space posterior sampling method that enables scalable exploration of the source posterior without retraining the prior. The method is benchmarked on 2D toy posteriors, chaotic Kuramoto-Sivashinsky trajectories, and wall-bounded turbulence reconstruction to quantify trade-offs among measurement assimilation, posterior diversity, and physics/statistics fidelity.
Data assimilation and scientific inverse problems require reconstructing high-dimensional physical states from sparse and noisy observations, ideally with uncertainty-aware posterior samples that remain faithful to learned priors and governing physics. While training-free conditional generation is well developed for diffusion models, corresponding conditioning and posterior sampling strategies for Flow Matching (FM) priors remain comparatively under-explored, especially on scientific benchmarks where fidelity must be assessed beyond measurement misfit. In this work, we study training-free conditional generation for scientific inverse problems under FM priors and organize existing inference-time strategies by where measurement information is injected: (i) guided transport dynamics that perturb sampling trajectories using likelihood information, and (ii) source-distribution inference that performs posterior inference over the source variable while keeping the learned transport fixed. Building on the latter, we propose D-Flow SGLD, a source-space posterior sampling method that augments differentiable source inference with preconditioned stochastic gradient Langevin dynamics, enabling scalable exploration of the source posterior induced by new measurement operators without retraining the prior or modifying the learned FM dynamics. We benchmark representative methods from both families on a hierarchy of problems: 2D toy posteriors, chaotic Kuramoto-Sivashinsky trajectories, and wall-bounded turbulence reconstruction. Across these settings, we quantify trade-offs among measurement assimilation, posterior diversity, and physics/statistics fidelity, and establish D-Flow SGLD as a practical FM-compatible posterior sampler for scientific inverse problems.
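To make the idea of source-space sampling concrete, the following is a minimal sketch on a 2D toy problem, not the paper's implementation. It makes several assumptions: the frozen FM transport is replaced by a hypothetical elementwise map `transport` with an analytic Jacobian (a real FM prior would integrate a learned ODE and use automatic differentiation), the source prior is a standard Gaussian, the measurement operator `H` is linear with Gaussian noise of scale `sigma`, and the preconditioner is the RMSprop-style diagonal one commonly used in preconditioned SGLD, with the small curvature-correction term omitted. All function names and hyperparameters here are illustrative.

```python
import numpy as np

def transport(z):
    """Hypothetical stand-in for the frozen FM flow map (source z -> state x)."""
    return z + 0.5 * np.tanh(z)

def transport_jac_diag(z):
    """Diagonal Jacobian of the elementwise toy map above."""
    return 1.0 + 0.5 * (1.0 - np.tanh(z) ** 2)

def grad_log_posterior(z, y, H, sigma):
    """Gradient of log N(z; 0, I) + log N(y; H T(z), sigma^2 I) with respect to z."""
    residual = y - H @ transport(z)
    grad_lik = transport_jac_diag(z) * (H.T @ residual) / sigma**2
    return -z + grad_lik

def psgld_sample(y, H, sigma, n_steps=25_000, burn_in=5_000,
                 step=0.02, alpha=0.99, lam=1e-3, seed=0):
    """Preconditioned SGLD over the source variable; the transport stays fixed."""
    rng = np.random.default_rng(seed)
    dim = H.shape[1]
    z = np.zeros(dim)
    v = np.ones(dim)  # running average of squared gradients (assumed init)
    samples = []
    for k in range(n_steps):
        g = grad_log_posterior(z, y, H, sigma)
        v = alpha * v + (1.0 - alpha) * g**2
        precond = 1.0 / (lam + np.sqrt(v))  # diagonal RMSprop-style preconditioner
        noise = rng.standard_normal(dim)
        # Langevin update: preconditioned drift plus matching Gaussian noise
        z = z + 0.5 * step * precond * g + np.sqrt(step * precond) * noise
        if k >= burn_in:
            samples.append(transport(z))  # push source sample through the map
    return np.array(samples)

# Observe only the first state component; the second stays unconstrained.
H = np.array([[1.0, 0.0]])
y = np.array([1.5])
x_samples = psgld_sample(y, H, sigma=0.1)
print("observed component mean:", x_samples[:, 0].mean())
print("unobserved component std:", x_samples[:, 1].std())
```

In this toy setting the observed component should concentrate near the measurement, while the unobserved component keeps roughly the spread of the prior. This mirrors, in miniature, the trade-off between measurement assimilation and posterior diversity that the abstract describes.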