ML LG ME
May 31, 2025

Score Matching With Missing Data

arXiv:2506.00557v114.05 citations · h-index: 2 · ICML

Originality: Incremental advance

AI Analysis

This work extends score matching to incomplete data, an incremental but important advance for applications such as diffusion processes and graphical models.

The paper tackles the problem of applying score matching to incomplete data, where any subset of the coordinates may be missing. It develops two variations: an importance weighting approach, which performs strongly in low-dimensional, small-sample cases, and a variational approach, which excels in high-dimensional settings, as demonstrated on graphical model estimation tasks.

Score matching is a vital tool for learning the distribution of data, with applications across many areas including diffusion processes, energy-based modelling, and graphical model estimation. Despite all these applications, little work explores its use when data is incomplete. We address this by adapting score matching (and its major extensions) to work with missing data in a flexible setting where data can be partially missing over any subset of the coordinates. We provide two separate score matching variations for general use: an importance weighting (IW) approach and a variational approach. We provide finite-sample bounds for our IW approach in finite-domain settings and show it to have especially strong performance in small-sample, lower-dimensional cases. Complementing this, we show our variational approach to be strongest in more complex high-dimensional settings, which we demonstrate on graphical model estimation tasks on both real and simulated data.
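
As background for the abstract above, the block below sketches the standard (Hyvärinen) score matching objective and why missing coordinates complicate it. This is textbook material, not the paper's estimators. The notation is an assumption introduced here for illustration: \tilde p_\theta is an unnormalised model density, s_\theta its score, and x_o, x_m the observed and missing coordinates of a sample.

```latex
% Standard score matching objective for fully observed data
% (notation assumed for illustration, not taken from the paper):
\[
  J(\theta)
  = \mathbb{E}_{x \sim p}\!\left[
      \tfrac{1}{2}\,\lVert s_\theta(x) \rVert^{2}
      + \operatorname{tr}\!\bigl(\nabla_x s_\theta(x)\bigr)
    \right] + \text{const},
  \qquad
  s_\theta(x) = \nabla_x \log \tilde p_\theta(x).
\]
% With only x_o observed, the relevant quantity is the marginal score,
% which averages the joint score over the model's conditional of x_m:
\[
  \nabla_{x_o} \log p_\theta(x_o)
  = \mathbb{E}_{x_m \sim p_\theta(\cdot \mid x_o)}\!\left[
      \nabla_{x_o} \log \tilde p_\theta(x_o, x_m)
    \right].
\]
```

The conditional expectation in the second line is generally intractable for unnormalised models, which is plausibly where approximations such as the importance weighting and variational approaches named in the abstract come in.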
