ML | LG | DATA-AN | ME | Apr 2, 2019

UAFS: Uncertainty-Aware Feature Selection for Problems with Missing Data

arXiv:1904.01385v3
Originality: Incremental advance
AI Analysis

This paper addresses a challenge that data scientists and analysts face with real-world datasets: handling missing data. Its contribution is incremental, combining feature selection with imputation.

The paper tackles the problem of imputing missing data in high-dimensional datasets by proposing uncertainty-aware feature selection (UAFS) as a preprocessing step, demonstrating improved imputation accuracy and subsequent prediction accuracy across various datasets and missingness levels.

Missing data are a concern in many real-world data sets, and imputation methods are often needed to estimate the missing values, but data sets with excessive missingness and high dimensionality challenge most approaches to imputation. Here we show that appropriate feature selection can be an effective preprocessing step for imputation, allowing for more accurate imputation and subsequent model predictions. The key feature of this preprocessing is that it incorporates uncertainty: by accounting for uncertainty due to missingness when selecting features, we can reduce the degree of missingness while also limiting the number of uninformative features used to build predictive models. We introduce a method to perform uncertainty-aware feature selection (UAFS), provide a theoretical motivation, and test UAFS on both real and synthetic problems, demonstrating that we can improve the accuracy of imputations across a variety of data sets and levels of missingness. Improved imputation due to UAFS also results in improved prediction accuracy when performing supervised learning on the imputed data sets. Our UAFS method is general and can be fruitfully coupled with a variety of imputation methods.
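The abstract does not spell out the UAFS algorithm, so the following is only an illustrative sketch of the general idea of letting missingness widen the uncertainty of a feature's relevance score. It is not the authors' method. The function name `select_features_with_uncertainty`, the use of Pearson correlation with a Fisher z confidence interval, and the `threshold` and `z` parameters are all assumptions made for this example.

```python
import numpy as np


def select_features_with_uncertainty(X, y, threshold=0.1, z=1.96):
    """Keep columns whose |correlation| with y is credibly above `threshold`.

    Hypothetical stand-in for uncertainty-aware selection, NOT the paper's
    algorithm. Missing entries in X are NaN. Each column's correlation is
    computed on its observed rows only, and the lower end of a Fisher z
    confidence interval is compared with `threshold`, so columns with few
    observed values need a much stronger signal to be selected.
    """
    selected = []
    for j in range(X.shape[1]):
        observed = ~np.isnan(X[:, j])
        n = int(observed.sum())
        if n < 4:  # the Fisher interval needs more than 3 observations
            continue
        xj, yj = X[observed, j], y[observed]
        if xj.std() == 0 or yj.std() == 0:
            continue  # correlation is undefined for constant data
        r = np.clip(np.corrcoef(xj, yj)[0, 1], -0.999999, 0.999999)
        # Lower confidence bound on |r|; the interval widens as n shrinks.
        lower = np.tanh(np.arctanh(abs(r)) - z / np.sqrt(n - 3))
        if lower > threshold:
            selected.append(j)
    return selected


# Deterministic toy data: y and noise are orthogonal +/-1 patterns.
y = np.tile([1.0, 1.0, -1.0, -1.0], 50)
noise = np.tile([1.0, -1.0, 1.0, -1.0], 50)
informative = y + noise          # correlation with y is 1/sqrt(2)
sparse = informative.copy()
sparse[8:] = np.nan              # same signal, but 96% missing
X = np.column_stack([informative, sparse, noise])

selected = select_features_with_uncertainty(X, y)
print(selected)
```

In this toy setup the first two columns carry the same signal, but the heavily missing copy is rejected because its confidence interval is too wide, and the pure-noise column is rejected because its correlation with `y` is zero. A downstream imputation method would then be applied only to the retained columns.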

Code Implementations: 1 repo
