CV · LG · ML · Apr 18, 2017

Ranking to Learn: Feature Ranking and Selection via Eigenvector Centrality

arXiv:1704.05409v1 · 58 citations
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of reducing dimensionality in large datasets for machine learning applications; it represents an incremental improvement over existing methods.

The paper tackles the problem of high-dimensional data by proposing a graph-based feature selection method using eigenvector centrality to rank features, achieving remarkable accuracy, stability, and low execution time across 7 diverse datasets.

In an era where accumulating data is easy and storing it is inexpensive, feature selection plays a central role in reducing the high dimensionality of huge amounts of otherwise meaningless data. In this paper, we propose a graph-based method for feature selection that ranks features by identifying the most important ones within an arbitrary set of cues. The problem is mapped onto an affinity graph in which features are the nodes, and the solution is given by assessing the importance of nodes through indicators of centrality, in particular Eigenvector Centrality (EC). The gist of EC is to estimate the importance of a feature as a function of the importance of its neighbors. Ranking central nodes identifies candidate features, which turn out to be effective from a classification point of view, as shown by a thorough experimental section. Our approach has been tested on 7 diverse datasets from recent literature (e.g., biological data and object recognition, among others) and compared against filter, embedded, and wrapper methods. The results are remarkable in terms of accuracy, stability, and low execution time.
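
The ranking idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the affinity graph is weighted, so the sketch assumes absolute Pearson correlation between feature columns as a stand-in affinity, and the function names `eigenvector_centrality` and `rank_features` are hypothetical. It shows only the core step, scoring each feature by its entry in the leading eigenvector of the affinity matrix.

```python
import numpy as np


def eigenvector_centrality(A):
    """Return the unit-norm leading eigenvector of a symmetric, non-negative matrix A.

    Each entry is the centrality of the corresponding node: a node scores
    highly when it is strongly connected to other high-scoring nodes.
    """
    # eigh handles symmetric matrices and returns eigenvalues in ascending order,
    # so the last column belongs to the largest eigenvalue.
    _, eigvecs = np.linalg.eigh(A)
    leading = eigvecs[:, -1]
    # The eigenvector's sign is arbitrary; take absolute values so scores are >= 0.
    return np.abs(leading)


def rank_features(X):
    """Rank the columns of X (samples x features) by eigenvector centrality.

    ASSUMPTION: the affinity between two features is the absolute Pearson
    correlation of their columns. This is an illustrative choice only.
    """
    A = np.abs(np.corrcoef(X, rowvar=False))
    A = np.nan_to_num(A, nan=0.0)   # constant columns yield NaN correlations
    np.fill_diagonal(A, 0.0)        # no self-loops in the feature graph
    scores = eigenvector_centrality(A)
    order = np.argsort(-scores)     # feature indices, most central first
    return order, scores
```

Calling `order, scores = rank_features(X)` on a samples-by-features array returns the feature indices sorted from most to least central; keeping the first k entries of `order` gives a filter-style selection. Under this stand-in affinity, features that correlate with many others rank high, which need not match the behavior of the weighting used in the paper.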
