ML · LG · Nov 1, 2018

HMLasso: Lasso with High Missing Rate

arXiv:1811.00255v41 citations
Originality: Incremental advance
AI Analysis

This paper addresses a practical bottleneck in sparse regression, namely data with many missing values. The advance is incremental because it builds on CoCoLasso.

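The paper tackles Lasso regression on high-dimensional data with high missing rates. It proposes HMLasso, which modifies CoCoLasso by weighting the covariance estimate according to missing ratios, incorporating the mean imputed covariance while reducing its bias. The authors report that the method remains effective in experiments even when many values are missing.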

Sparse regression such as the Lasso has achieved great success in handling high-dimensional data. However, one of the biggest practical problems is that high-dimensional data often contain large amounts of missing values. Convex Conditioned Lasso (CoCoLasso) has been proposed for dealing with high-dimensional data with missing values, but it performs poorly when there are many missing values, so that the high missing rate problem has not been resolved. In this paper, we propose a novel Lasso-type regression method for high-dimensional data with high missing rates. We effectively incorporate mean imputed covariance, overcoming its inherent estimation bias. The result is an optimally weighted modification of CoCoLasso according to missing ratios. We theoretically and experimentally show that our proposed method is highly effective even when there are many missing values.
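To make the "inherent estimation bias" of the mean imputed covariance concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation. The function name `pairwise_and_imputed_cov`, the use of `np.nan` as the missing-value marker, and centering each column by its observed mean are all assumptions made for illustration. The sketch compares a pairwise covariance estimate (each entry averaged over the rows where both features are observed) with the covariance obtained after mean imputation, and shows that the latter is the former shrunk entrywise by the fraction of jointly observed rows.

```python
import numpy as np

def pairwise_and_imputed_cov(X):
    """Hypothetical helper. X is an (n, p) array with np.nan for missing entries."""
    observed = ~np.isnan(X)
    # Center each column by the mean of its observed values (an assumption).
    col_means = np.nanmean(X, axis=0)
    # Mean imputation on centered data: missing entries become 0.
    Z = np.where(observed, X - col_means, 0.0)

    n = X.shape[0]
    M = observed.astype(float)
    n_pair = M.T @ M          # n_pair[j, k] = rows where j and k are both observed
    cross = Z.T @ Z           # sums of products over jointly observed rows

    S_imp = cross / n                        # mean imputed covariance
    S_pair = cross / np.maximum(n_pair, 1)   # pairwise covariance
    R = n_pair / n                           # jointly observed ratios
    return S_pair, S_imp, R

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[rng.random(X.shape) < 0.4] = np.nan        # roughly 40% missing at random

S_pair, S_imp, R = pairwise_and_imputed_cov(X)
# The imputed covariance equals the pairwise one scaled by the observed ratios,
# so it is biased toward zero wherever data are missing.
print(np.allclose(S_imp, R * S_pair))
```

Under these assumptions, entries with few jointly observed rows are shrunk the most in the imputed estimate and are the least reliable in the pairwise estimate, which gives some intuition for why a modification weighted by missing ratios is natural. The positive semidefinite fitting step that CoCoLasso and HMLasso perform is not shown here.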
