LG · AI · ML · May 22, 2019

Generative Imputation and Stochastic Prediction

arXiv:1905.09340v4 · 6 citations
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of handling missing data and the resulting uncertainty in classification tasks, but it appears incremental because it builds on existing generative imputation techniques.

The paper tackles the problem of incomplete datasets in machine learning by proposing a method for imputing missing features and estimating target class uncertainties, showing effectiveness on image and tabular datasets under various missingness conditions.

In many machine learning applications, we are faced with incomplete datasets. In the literature, missing data imputation techniques have mostly been concerned with filling in missing values. However, missing values imply uncertainty not only over the distribution of the missing values but also over target class assignments, and both require careful consideration. In this paper, we propose a simple and effective method for imputing missing features and estimating the distribution of target assignments given incomplete data. To make imputations, we train a generator network to generate imputations that a discriminator network is tasked to distinguish. A predictor network is then trained on the imputed samples from the generator network to capture the classification uncertainties and make predictions accordingly. The proposed method is evaluated on the CIFAR-10 and MNIST image datasets as well as five real-world tabular classification datasets, under different missingness rates and structures. Our experimental results show the effectiveness of the proposed method in generating imputations as well as in providing estimates of the class uncertainties in a classification task when faced with missing values.
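The prediction step described in the abstract can be illustrated with a minimal sketch: draw several imputations of the same incomplete input, run a predictor on each, and average the class probabilities. This is not the authors' implementation. The names `impute`, `stochastic_predict`, `toy_generator`, and `toy_predictor` are hypothetical, and the two toy functions are stand-ins for the trained generator and predictor networks, whose architectures and training (including the discriminator) are not shown here.

```python
import math
import random


def impute(x, mask, generator, rng):
    """Keep observed entries (mask == 1); fill the rest with one generator sample."""
    proposal = generator(x, mask, rng)
    return [xi if m == 1 else pi for xi, m, pi in zip(x, mask, proposal)]


def stochastic_predict(x, mask, generator, predictor, n_samples=100, seed=0):
    """Average class probabilities over several imputations of the same input."""
    rng = random.Random(seed)
    samples = [predictor(impute(x, mask, generator, rng)) for _ in range(n_samples)]
    n_classes = len(samples[0])
    mean = [sum(s[k] for s in samples) / n_samples for k in range(n_classes)]
    return mean, samples


# Hypothetical stand-in for a trained generator: samples each feature from N(0, 1).
def toy_generator(x, mask, rng):
    return [rng.gauss(0.0, 1.0) for _ in x]


# Hypothetical stand-in for a trained predictor: a fixed two-class logistic model.
def toy_predictor(x):
    p1 = 1.0 / (1.0 + math.exp(-sum(x)))
    return [1.0 - p1, p1]


x = [0.5, 0.0, -0.2]  # the middle value is a placeholder; it is unobserved
mask = [1, 0, 1]      # 1 = observed, 0 = missing
mean_probs, samples = stochastic_predict(x, mask, toy_generator, toy_predictor)
print(mean_probs)
```

The spread of the per-sample predictions in `samples` gives a rough picture of how much the missing feature affects the class assignment. When every feature is observed, all imputations are identical and the spread collapses to zero.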

Code Implementations: 2 repos