ML, LG · Jul 22, 2025

CoLT: The conditional localization test for assessing the accuracy of neural posterior estimates

Tianyu Chen, Vansh Bansal, James G. Scott

arXiv:2507.17030v1 · 4.5 · h-index: 2

Originality: Highly original

AI Analysis

This paper addresses the need to validate neural posterior estimates accurately in simulation-based inference, offering a scalable and principled method for researchers and practitioners in computational statistics and machine learning.

The paper tackles the problem of validating neural posterior estimates in simulation-based inference. It introduces the Conditional Localization Test (CoLT), which detects discrepancies between the true and estimated posteriors across all conditioning inputs. The authors report that CoLT outperforms existing methods and provides actionable insights for model refinement.

We consider the problem of validating whether a neural posterior estimate $q(θ\mid x)$ is an accurate approximation to the true, unknown posterior $p(θ\mid x)$. Existing methods for evaluating the quality of a neural posterior estimate (NPE) are largely derived from classifier-based tests or divergence measures, but these suffer from several practical drawbacks. As an alternative, we introduce the Conditional Localization Test (CoLT), a principled method designed to detect discrepancies between $p(θ\mid x)$ and $q(θ\mid x)$ across the full range of conditioning inputs. Rather than relying on exhaustive comparisons or density estimation at every $x$, CoLT learns a localization function that adaptively selects points $θ_l(x)$ where the neural posterior $q$ deviates most strongly from the true posterior $p$ for that $x$. This approach is particularly advantageous in typical simulation-based inference settings, where only a single draw $θ\sim p(θ\mid x)$ from the true posterior is observed for each conditioning input, but where the neural posterior $q(θ\mid x)$ can be sampled an arbitrary number of times. Our theoretical results establish necessary and sufficient conditions for assessing distributional equality across all $x$, offering both rigorous guarantees and practical scalability. Empirically, we demonstrate that CoLT not only performs better than existing methods at comparing $p$ and $q$, but also pinpoints regions of significant divergence, providing actionable insights for model refinement. These properties position CoLT as a state-of-the-art solution for validating neural posterior estimates.
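To make the sampling setup in the abstract concrete, here is a minimal Python sketch of a localized check. It is an illustration under stated assumptions, not the paper's implementation. The toy model (prior $θ\sim N(0,1)$, likelihood $x\mid θ\sim N(θ,1)$, so the true posterior is $N(x/2, 1/2)$), the fixed ball radius, and the function names `simulate_joint`, `sample_q`, `localization_point`, and `localized_z_score` are all hypothetical. In particular, the localization function here is hand-picked rather than learned, and the ball-probability comparison is only one plausible way to turn "one draw from $p$, many draws from $q$" into a test statistic.

```python
import numpy as np


def simulate_joint(n, rng):
    """Draw (theta_i, x_i) pairs from the toy joint model.

    Each theta_i is a single draw from the true posterior p(theta | x_i),
    which is the only access to p that the SBI setting provides.
    """
    theta = rng.normal(0.0, 1.0, size=n)
    x = rng.normal(theta, 1.0)
    return theta, x


def sample_q(x, m, rng, bias=0.0):
    """Draw m samples per x from a stand-in 'neural' posterior q(theta | x).

    bias=0.0 reproduces the exact posterior N(x/2, 1/2) of the toy model;
    a nonzero bias shifts the mean to mimic a miscalibrated estimate.
    """
    mean = x[:, None] / 2.0 + bias
    return rng.normal(mean, np.sqrt(0.5), size=(x.shape[0], m))


def localization_point(x):
    """Hand-picked stand-in for a localization function theta_l(x).

    Assumption: a learned function would be trained to find where q and p
    disagree most; this fixed offset is only for illustration.
    """
    return x / 2.0 + 0.5


def localized_z_score(theta, x, theta_q, radius=0.5):
    """Compare ball probabilities around theta_l(x) under p and q.

    For each x_i, the indicator that the single true draw lands in the ball
    estimates the mass under p, and the fraction of q-samples in the ball
    estimates the mass under q. If p = q for all x, the differences have
    mean zero, so the standardized mean is roughly N(0, 1).
    """
    center = localization_point(x)
    in_ball_p = (np.abs(theta - center) <= radius).astype(float)
    in_ball_q = (np.abs(theta_q - center[:, None]) <= radius).mean(axis=1)
    diff = in_ball_p - in_ball_q
    return diff.mean() / (diff.std(ddof=1) / np.sqrt(diff.size))


rng = np.random.default_rng(0)
theta, x = simulate_joint(5000, rng)

z_exact = localized_z_score(theta, x, sample_q(x, 200, rng, bias=0.0))
z_biased = localized_z_score(theta, x, sample_q(x, 200, rng, bias=0.5))
print(f"z-score, exact q:  {z_exact:.2f}")
print(f"z-score, biased q: {z_biased:.2f}")
```

In this sketch a z-score of large magnitude signals that $q$ puts a different amount of mass than $p$ near $θ_l(x)$. A fixed localization point can miss discrepancies elsewhere, which is the motivation the abstract gives for learning the localization function.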

