A Structured Prediction Approach for Missing Value Imputation
This work addresses missing data imputation under domain constraints, a problem relevant to data scientists. The contribution is incremental: it builds on existing imputation methods by adding a structured formulation.
The paper tackles missing value imputation by proposing a structured output approach that incorporates domain constraints, and it reports significantly improved performance on the Hamming loss measure compared with state-of-the-art methods.
Missing value imputation is an important practical problem. There is a large body of work on it, but no existing work formulates the problem in a structured output setting. In addition, most applications impose constraints on the imputed data, for example on the distribution associated with each variable, and none of the existing imputation methods use these constraints. In this paper we propose a structured output approach for missing value imputation that also incorporates domain constraints. We focus on large margin models, but the ideas extend easily to probabilistic models. We deal with the intractable inference step in learning via a piecewise training technique that is simple, efficient, and effective. Comparison with existing state-of-the-art and baseline imputation methods shows that our method gives significantly improved performance on the Hamming loss measure.
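
To make the evaluation measure concrete, the following is a minimal sketch of how Hamming loss could be computed for an imputation method. It is not the paper's implementation. It assumes categorical data stored as lists of rows, a hypothetical `MISSING` marker, and that the loss is averaged over the originally missing cells only. The function names `mode_impute` and `hamming_loss` are illustrative, and the imputer shown is a simple most-frequent-value baseline, not the structured output method.

```python
from collections import Counter

MISSING = None  # hypothetical marker for a missing cell (an assumption of this sketch)


def mode_impute(rows):
    """Baseline imputer: fill each missing cell with the most frequent
    observed value in its column. Not the structured method from the paper."""
    n_cols = len(rows[0])
    modes = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not MISSING]
        modes.append(Counter(observed).most_common(1)[0][0])
    return [
        [modes[j] if r[j] is MISSING else r[j] for j in range(n_cols)]
        for r in rows
    ]


def hamming_loss(truth, imputed, masked):
    """Fraction of originally missing cells whose imputed value
    differs from the ground truth."""
    missing_cells = [
        (i, j)
        for i, row in enumerate(masked)
        for j, value in enumerate(row)
        if value is MISSING
    ]
    if not missing_cells:
        return 0.0
    errors = sum(truth[i][j] != imputed[i][j] for i, j in missing_cells)
    return errors / len(missing_cells)


# Toy data: two cells are hidden from the imputer.
truth = [["a", "x"], ["a", "y"], ["b", "x"], ["a", "x"]]
masked = [["a", MISSING], ["a", "y"], [MISSING, "x"], ["a", "x"]]

imputed = mode_impute(masked)
print(hamming_loss(truth, imputed, masked))  # 0.5: one of two hidden cells is wrong
```

In this toy case the baseline recovers the hidden "x" but fills the hidden "b" with the column mode "a", so half of the imputed cells are wrong. A lower value indicates better imputation.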