LG · Aug 2, 2022

An Online Sparse Streaming Feature Selection Algorithm

arXiv:2208.01562v2 · 3 citations · h-index: 48
Originality: Incremental advance
AI Analysis

This work addresses a crucial challenge in intelligent healthcare platforms and similar applications, where sparse streaming features with missing data require effective feature selection. It is an incremental advance, building on existing online streaming feature selection (OSFS) methods.

The paper tackles online streaming feature selection with missing data by proposing the OS2FSU algorithm, which uses latent factor analysis to estimate missing values and combines fuzzy logic with neighborhood rough sets to handle uncertainty. In experiments on six real datasets, it outperforms five state-of-the-art OSFS algorithms when missing data are present.

Online streaming feature selection (OSFS), which conducts feature selection in an online manner, plays an important role in dealing with high-dimensional data. In many real applications, such as intelligent healthcare platforms, streaming features often have missing data, which raises a crucial challenge for OSFS: how to establish the uncertain relationship between sparse streaming features and labels. Existing OSFS algorithms do not consider this uncertain relationship. To fill this gap, this paper proposes an online sparse streaming feature selection with uncertainty (OS2FSU) algorithm. OS2FSU consists of two main parts: 1) latent factor analysis is used to pre-estimate the missing data in sparse streaming features before conducting feature selection, and 2) fuzzy logic and neighborhood rough sets are employed to alleviate the uncertainty between the estimated streaming features and the labels during feature selection. In the experiments, OS2FSU is compared with five state-of-the-art OSFS algorithms on six real datasets. The results demonstrate that OS2FSU outperforms its competitors when missing data are encountered in OSFS.
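The two parts described in the abstract can be illustrated with a rough sketch. This is not the authors' OS2FSU implementation: it assumes a plain SGD-trained matrix factorization as the latent factor model and the standard positive-region dependency from neighborhood rough sets as the feature-quality score, and it omits the fuzzy-logic step entirely. The function names `lfa_impute` and `neighborhood_dependency` and all parameter values (`rank`, `lr`, `reg`, `epochs`, `delta`) are hypothetical choices made for illustration.

```python
import numpy as np


def lfa_impute(X, rank=2, lr=0.01, reg=0.05, epochs=200, seed=0):
    """Fill NaN entries of X with a low-rank latent factor estimate.

    Assumption: a basic L2-regularized factorization X ~ P @ Q.T,
    fitted by SGD on the observed entries only.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    P = rng.normal(scale=0.1, size=(n, rank))  # sample factors
    Q = rng.normal(scale=0.1, size=(m, rank))  # feature factors
    observed = np.argwhere(~np.isnan(X))       # (row, col) index pairs
    for _ in range(epochs):
        rng.shuffle(observed)                  # shuffle the index pairs
        for i, j in observed:
            err = X[i, j] - P[i] @ Q[j]
            p_old = P[i].copy()
            P[i] += lr * (err * Q[j] - reg * P[i])
            Q[j] += lr * (err * p_old - reg * Q[j])
    # Keep observed values; use the model only where data is missing.
    return np.where(np.isnan(X), P @ Q.T, X)


def neighborhood_dependency(X, y, delta=0.2):
    """Share of samples whose delta-neighborhood is label-consistent.

    This is the positive-region dependency of neighborhood rough sets;
    features are assumed to be scaled to a comparable range.
    """
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    consistent = 0
    for i in range(len(y)):
        neighbors = dists[i] <= delta          # includes sample i itself
        if np.all(y[neighbors] == y[i]):
            consistent += 1
    return consistent / len(y)


# Hypothetical usage: impute a sparse feature block, then score features.
X_sparse = np.array([
    [0.1, 0.2, np.nan],
    [0.2, np.nan, 0.3],
    [0.9, 0.8, 0.7],
    [np.nan, 0.9, 0.8],
])
labels = np.array([0, 0, 1, 1])
X_filled = lfa_impute(X_sparse)
score = neighborhood_dependency(X_filled[:, :1], labels, delta=0.2)
print("dependency of first feature:", score)
```

In a streaming setting, a score like this could be recomputed as each new feature arrives, keeping the feature only if it raises the dependency of the selected subset. That selection rule is an assumption here, not a description of the paper's criteria.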

