Scalable Simulation-Based Model Inference with Test-Time Complexity Control
This work addresses the model-selection bottleneck in scientific applications such as neuroimaging, offering a scalable and flexible alternative to existing methods. The contribution is incremental, however, in that it builds on existing amortized inference approaches.
The paper tackles the problem of selecting among large families of plausible simulators in scientific discovery, where classical Bayesian methods are impractical. It introduces PRISM, a simulation-based encoder-decoder that infers a joint posterior over model structures and parameters with test-time complexity control. PRISM scales to model families with up to billions of instantiations on a synthetic symbolic regression task, and it is demonstrated on model selection for biophysical modeling of diffusion MRI data.
Simulation plays a central role in scientific discovery. In many applications, the bottleneck is no longer running a simulator; it is choosing among large families of plausible simulators, each corresponding to a different forward model or hypothesis consistent with the observations. Over large model families, classical Bayesian workflows for model selection are impractical. Furthermore, amortized model selection methods typically hard-code a fixed model prior or complexity penalty at training time, requiring users to commit to a particular parsimony assumption before seeing the data. We introduce PRISM, a simulation-based encoder-decoder that infers a joint posterior over both discrete model structures and their associated continuous parameters, while enabling test-time control of model complexity via a tunable model prior on which the network is conditioned. We show that PRISM scales to families with combinatorially many (up to billions of) model instantiations on a synthetic symbolic regression task. As a scientific application, we evaluate PRISM on biophysical modeling of diffusion MRI data, showing that it can perform model selection across several multi-compartment models on both synthetic and in vivo neuroimaging data.
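To make the idea of a tunable model prior concrete, the following is a minimal sketch of how a complexity parameter can shift a posterior over models at test time. It is not PRISM's implementation, which is a learned encoder-decoder. It only applies Bayes' rule to a small, explicitly enumerated model family. The exponential prior form p(M | lam) proportional to exp(-lam * c(M)), the function name `model_posterior`, and all numbers are assumptions made for illustration.

```python
import math

def model_posterior(log_evidence, complexity, lam):
    """Posterior over models under an assumed prior
    p(M | lam) proportional to exp(-lam * complexity[M])."""
    # Unnormalized log posterior: log p(x | M) + log p(M | lam), up to a constant.
    scores = [le - lam * c for le, c in zip(log_evidence, complexity)]
    # Normalize with log-sum-exp for numerical stability.
    top = max(scores)
    log_z = top + math.log(sum(math.exp(s - top) for s in scores))
    return [math.exp(s - log_z) for s in scores]

# Hypothetical log evidences for three candidate models with 1, 2, and 3
# components; the more complex models fit the data slightly better.
log_evidence = [-12.0, -10.0, -9.5]
complexity = [1, 2, 3]

for lam in (0.0, 1.0, 3.0):
    post = model_posterior(log_evidence, complexity, lam)
    best = max(range(len(post)), key=post.__getitem__)
    print(f"lam={lam}: preferred model index {best}, "
          f"posterior {[round(p, 3) for p in post]}")
```

With these hypothetical numbers, lam = 0 favors the most complex model, lam = 1 favors the middle one, and lam = 3 favors the simplest. In this sketch, changing the parsimony assumption only requires re-evaluating the posterior with a different lam. The abstract describes an amortized analogue, in which the network is conditioned on the model prior so that the prior can be changed at test time.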