LG DS ML

Jul 2, 2017

Dimensionality reduction with missing values imputation

Rania Mkhinini Gahar, Olfa Arfaoui, Minyar Sassi Hidri, Nejib Ben-Hadj Alouane

arXiv:1707.00351v1

Originality: Synthesis-oriented

AI Analysis

This paper addresses data exploration challenges in machine learning, but its contribution is incremental because it combines existing methods.

The paper tackles high-dimensional data reduction and missing value imputation by combining dimensionality reduction with Random Forest imputation, showing efficiency in experiments on public datasets.

In this study, we propose a new statistical approach for high-dimensionality reduction of heterogeneous data that limits the curse of dimensionality and deals with missing values. To handle the latter, we propose to use the Random Forest imputation method. The main purpose here is to extract useful information and thereby reduce the search space to facilitate the data exploration process. Several illustrative numeric examples, using data from publicly available machine learning repositories, are also included. The experimental component of the study shows the efficiency of the proposed analytical approach.
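
The abstract does not spell out the pipeline, so the following is only a minimal sketch of the general idea (impute missing values with a Random Forest, then reduce dimensionality), not the authors' method. It assumes purely numeric data, uses scikit-learn's `IterativeImputer` with a `RandomForestRegressor` as a stand-in for Random Forest imputation, and uses PCA as a stand-in for the reduction step. The function name `impute_then_reduce`, the synthetic data, and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (activates IterativeImputer)
from sklearn.impute import IterativeImputer
from sklearn.ensemble import RandomForestRegressor
from sklearn.decomposition import PCA


def impute_then_reduce(X, n_components=2, seed=0):
    """Fill NaNs with a Random Forest based imputer, then project with PCA.

    Assumption: numeric features only. Heterogeneous (mixed-type) data, as
    targeted by the paper, would need encoding or a different reduction method.
    """
    imputer = IterativeImputer(
        estimator=RandomForestRegressor(n_estimators=20, random_state=seed),
        max_iter=3,  # kept small so the sketch runs quickly
        random_state=seed,
    )
    X_filled = imputer.fit_transform(X)
    X_reduced = PCA(n_components=n_components).fit_transform(X_filled)
    return X_filled, X_reduced


# Synthetic data (hypothetical): 6 correlated features driven by 2 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(60, 2))
X = latent @ rng.normal(size=(2, 6)) + 0.1 * rng.normal(size=(60, 6))

# Hide roughly 10% of the entries to simulate missing values.
mask = rng.random(X.shape) < 0.1
X_missing = X.copy()
X_missing[mask] = np.nan

X_filled, X_reduced = impute_then_reduce(X_missing)
print(X_filled.shape, X_reduced.shape)
```

Imputing before reducing matters here because PCA cannot be fitted on data containing NaNs. Observed entries pass through the imputer unchanged, and only the masked cells are estimated.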
