LG · Sep 27, 2025

Impute-MACFM: Imputation based on Mask-Aware Flow Matching

arXiv:2509.23126v1 · 1 citation · h-index: 21
Originality: Incremental advance
AI Analysis

This work addresses missing data in tabular applications such as healthcare. It offers a new imputation method, though the advance is incremental because it builds on existing flow matching techniques.

The paper tackles missing values in tabular data, especially in healthcare, by proposing Impute-MACFM, a mask-aware conditional flow matching framework. The authors report state-of-the-art results, with more robust, efficient, and higher-quality imputation than competing approaches.

Tabular data are central to many applications, especially longitudinal data in healthcare, where missing values are common and undermine model fidelity and reliability. Prior imputation methods either impose restrictive assumptions or struggle with complex cross-feature structure, while recent generative approaches suffer from instability and costly inference. We propose Impute-MACFM, a mask-aware conditional flow matching framework for tabular imputation that addresses three missingness mechanisms: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR). Its mask-aware objective builds trajectories only on missing entries while constraining the predicted velocity to remain near zero on observed entries, using flexible nonlinear schedules. Impute-MACFM combines: (i) stability penalties on observed positions, (ii) consistency regularization enforcing local invariance, and (iii) time-decayed noise injection for numeric features. Inference uses constraint-preserving ordinary differential equation (ODE) integration with per-step projection to fix observed values, optionally aggregating multiple trajectories for robustness. Across diverse benchmarks, Impute-MACFM achieves state-of-the-art results while delivering more robust, efficient, and higher-quality imputation than competing approaches, establishing flow matching as a promising direction for tabular missing-data problems, including longitudinal data.
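The mask-aware objective and the projected ODE inference described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a simple linear interpolation path (the paper uses flexible nonlinear schedules), a squared-error loss, a fixed-step Euler integrator, and a mask convention of 1 = observed, 0 = missing. It omits the consistency regularization and noise injection terms. The names `mask_aware_targets`, `mask_aware_loss`, `impute`, `velocity_fn`, and `lam` are hypothetical.

```python
import numpy as np

def mask_aware_targets(x1, mask, t, rng):
    """Build one training pair on a linear path (assumed schedule).

    x1:   complete rows, shape (n, d)
    mask: 1.0 where a cell is observed, 0.0 where it is missing
    t:    times in [0, 1], shape (n, 1)
    """
    x0 = rng.standard_normal(x1.shape)
    # Observed cells start at their data value, so they never move;
    # missing cells start from noise and travel toward the data value.
    start = np.where(mask == 1, x1, x0)
    xt = (1.0 - t) * start + t * x1
    target_v = x1 - start  # exactly zero on observed cells
    return xt, target_v

def mask_aware_loss(pred_v, target_v, mask, lam=1.0):
    """Flow matching error on missing cells plus a stability penalty
    that pushes the predicted velocity toward zero on observed cells."""
    miss = 1.0 - mask
    fm = np.sum(miss * (pred_v - target_v) ** 2) / max(miss.sum(), 1.0)
    stab = np.sum(mask * pred_v ** 2) / max(mask.sum(), 1.0)
    return fm + lam * stab

def impute(velocity_fn, x_obs, mask, n_steps=20, n_traj=4, rng=None):
    """Euler integration from t=0 to t=1 with per-step projection.

    velocity_fn(x, t, mask) stands in for a trained velocity model.
    x_obs may hold NaN at missing cells; those values are never read.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    x_fixed = np.where(mask == 1, x_obs, 0.0)
    dt = 1.0 / n_steps
    runs = []
    for _ in range(n_traj):
        # Start missing cells from noise, observed cells from the data.
        x = np.where(mask == 1, x_fixed, rng.standard_normal(x_obs.shape))
        for k in range(n_steps):
            x = x + dt * velocity_fn(x, k * dt, mask)
            # Projection: reset observed cells to their known values.
            x = np.where(mask == 1, x_fixed, x)
        runs.append(x)
    # Aggregate trajectories by averaging, then project once more.
    mean = np.mean(runs, axis=0)
    return np.where(mask == 1, x_fixed, mean)
```

In this sketch the projection step is what guarantees that observed values come back unchanged, regardless of how well the velocity model respects the near-zero constraint. Averaging several trajectories is one simple way to aggregate; the abstract does not specify the aggregation rule.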
