Tokenised Flow Matching for Hierarchical Simulation Based Inference
This work addresses efficiency bottlenecks in hierarchical SBI for applications such as infectious disease modelling and computational fluid dynamics, though it is incremental in that it builds on existing likelihood factorisation approaches.
The authors tackle the high computational cost of simulator evaluations in hierarchical Simulation Based Inference (SBI). They propose Tokenised Flow Matching for Posterior Estimation (TFMPE), which uses likelihood factorisation to train from single-site simulations, reducing computational cost while producing well-calibrated posteriors.
The cost of simulator evaluations is a key practical bottleneck for Simulation Based Inference (SBI). In hierarchical settings with shared global parameters and exchangeable site-level parameters and observations, this structure can be exploited to improve simulation efficiency. Existing hierarchical SBI approaches factorise the posterior yet still simulate across multiple sites per training sample; we instead explore likelihood factorisation (LF) to train from single-site simulations. In LF sampling, we learn a per-site neural surrogate of the simulator and then assemble synthetic multi-site observations to amortise inference for the full hierarchical posterior. Building on this, we propose Tokenised Flow Matching for Posterior Estimation (TFMPE), a tokenised flow matching approach that supports function-valued observations through likelihood factorisation. To enable systematic evaluation, we introduce a benchmark for hierarchical SBI. We validate TFMPE on this benchmark and on realistic infectious disease and computational fluid dynamics models, finding well-calibrated posteriors while reducing computational cost.
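The LF sampling idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a toy Gaussian hierarchical model, and a linear-Gaussian least-squares fit stands in for the per-site neural surrogate. All names (`sample_global`, `sample_local`, `simulate_site`, `surrogate_sample`, `assemble_multisite`) are hypothetical. The sketch shows only how single-site simulations can yield synthetic multi-site training data. Training the flow matching posterior estimator on that data is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_global(n):
    # Assumed toy prior over the shared global parameter.
    return rng.normal(0.0, 1.0, size=n)

def sample_local(mu):
    # Assumed toy prior over site-level parameters, centred on the global one.
    return rng.normal(mu, 0.5)

def simulate_site(mu, theta):
    # Hypothetical stand-in for an expensive single-site simulator.
    return mu + 2.0 * theta + rng.normal(0.0, 0.1, size=np.shape(theta))

# Step 1: run the simulator one site at a time.
n_train = 2000
mu_train = sample_global(n_train)
theta_train = sample_local(mu_train)
y_train = simulate_site(mu_train, theta_train)

# Step 2: fit a per-site surrogate of p(y | mu, theta).
# A linear-Gaussian model replaces the neural surrogate for brevity.
X = np.column_stack([np.ones(n_train), mu_train, theta_train])
coef, *_ = np.linalg.lstsq(X, y_train, rcond=None)
resid_std = np.std(y_train - X @ coef)

def surrogate_sample(mu, theta):
    # Draw a synthetic observation from the fitted surrogate.
    mean = coef[0] + coef[1] * mu + coef[2] * theta
    return mean + rng.normal(0.0, resid_std, size=np.shape(mean))

# Step 3: assemble synthetic multi-site observations from the surrogate.
def assemble_multisite(n_datasets, n_sites):
    mu = sample_global(n_datasets)                       # (n_datasets,)
    mu_tiled = np.repeat(mu[:, None], n_sites, axis=1)   # (n_datasets, n_sites)
    theta = sample_local(mu_tiled)                       # one draw per site
    y = surrogate_sample(mu[:, None], theta)             # sites drawn independently
    return mu, theta, y

mu_s, theta_s, y_s = assemble_multisite(n_datasets=8, n_sites=5)
```

Each row of `y_s` is a synthetic multi-site dataset paired with its global and site-level parameters, obtained without running the simulator across several sites at once. The sketch relies on the sites being conditionally independent given the global parameter, which is the assumption that allows per-site surrogate draws to be combined.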