Importance Corrected Neural JKO Sampling
This work addresses slow convergence and local minima when sampling from multimodal distributions, a problem that matters for applications in machine learning and statistics, though it is an incremental improvement over existing flow-based methods.
The paper tackles the problem of sampling from unnormalized multimodal probability densities by combining continuous normalizing flows with rejection-resampling steps. It proves that the learned velocity fields converge to the velocity field of the Wasserstein gradient flow and reports that the method significantly outperforms state-of-the-art approaches in almost all test cases.
To sample from an unnormalized probability density function, we propose to combine continuous normalizing flows (CNFs) with rejection-resampling steps based on importance weights. We relate the iterative training of CNFs with regularized velocity fields to a JKO scheme and prove convergence of the involved velocity fields to the velocity field of the Wasserstein gradient flow (WGF). Alternating local flow steps with non-local rejection-resampling steps makes it possible to overcome the local minima and slow convergence of the WGF for multimodal distributions. Since the proposals for the rejection steps are generated by the model itself, these steps do not suffer from the common drawbacks of classical rejection schemes. The resulting model can be trained iteratively, reduces the reverse Kullback-Leibler (KL) loss function in each step, generates i.i.d. samples, and allows the density of the generated samples to be evaluated. Numerical examples show that our method yields accurate results on various test distributions, including high-dimensional multimodal targets, and significantly outperforms the state of the art in almost all cases.
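To make the idea of a rejection-resampling step based on importance weights concrete, the following is a minimal one-dimensional sketch in NumPy. It is an illustration under stated assumptions, not the paper's implementation: the bimodal target, the fixed Gaussian proposal (standing in for the current model), the choice of the constant `c` as the mean importance weight, and all function names are hypothetical.

```python
import numpy as np

def log_target_unnormalized(x):
    # Hypothetical bimodal target: modes at -2 and +2 (std 0.5), known only up to a constant.
    return np.logaddexp(-0.5 * ((x + 2.0) / 0.5) ** 2,
                        -0.5 * ((x - 2.0) / 0.5) ** 2)

def log_proposal(x, scale=3.0):
    # Normalized log-density of N(0, scale^2); stands in for the current model density.
    return -0.5 * (x / scale) ** 2 - np.log(scale * np.sqrt(2.0 * np.pi))

def rejection_resampling_step(x, rng, scale=3.0):
    """One rejection-resampling round on samples x drawn from the proposal."""
    # Log importance weights: unnormalized target over proposal density.
    log_w = log_target_unnormalized(x) - log_proposal(x, scale)
    # Assumption: c is the mean importance weight (computed stably in log space).
    log_c = np.log(np.mean(np.exp(log_w - log_w.max()))) + log_w.max()
    # Acceptance probability alpha(x) = min(1, w(x) / c).
    alpha = np.minimum(1.0, np.exp(log_w - log_c))
    accept = rng.uniform(size=x.shape) < alpha
    # Rejected samples are replaced by fresh draws from the same proposal.
    fresh = rng.normal(0.0, scale, size=x.shape)
    return np.where(accept, x, fresh), alpha, accept

def mass_near_modes(x):
    # Fraction of samples within distance 1 of either mode.
    return np.mean(np.abs(np.abs(x) - 2.0) < 1.0)

rng = np.random.default_rng(0)
x_old = rng.normal(0.0, 3.0, size=10_000)
x_new, alpha, accept = rejection_resampling_step(x_old, rng)
print("acceptance rate:", accept.mean())
print("mass near modes before/after:", mass_near_modes(x_old), mass_near_modes(x_new))
```

In this sketch, a single round moves probability mass toward both modes at once, which a purely local flow step cannot do. Because rejected samples are redrawn from the same proposal, the density after the round has the closed form p(x)(α(x) + 1 − E_p[α]), so it can still be evaluated once E_p[α] is estimated. This mirrors the density-evaluation property claimed above, although the paper's exact scheme may differ.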