LG, ML | Jan 13, 2019

Gradient Boosted Feature Selection

Zhixiang Eddie Xu, Gao Huang, Kilian Q. Weinberger, Alice X. Zheng

arXiv:1901.04055v113.4203 citations

Originality: Incremental advance

AI Analysis

This work addresses feature selection for machine learning practitioners, offering a scalable and flexible solution, though it appears incremental as it builds on Gradient Boosted Trees.

The authors propose Gradient Boosted Feature Selection (GBFS), a feature selection algorithm designed to meet four criteria: reliable extraction of relevant features, identification of non-linear feature interactions, linear scaling with the number of features and dimensions, and incorporation of known sparsity structure. They report that it matches or outperforms other state-of-the-art feature selection methods on real-world data sets.

A feature selection algorithm should ideally satisfy four conditions: reliably extract relevant features; be able to identify non-linear feature interactions; scale linearly with the number of features and dimensions; allow the incorporation of known sparsity structure. In this work we propose a novel feature selection algorithm, Gradient Boosted Feature Selection (GBFS), which satisfies all four of these requirements. The algorithm is flexible, scalable, and surprisingly straightforward to implement, as it is based on a modification of Gradient Boosted Trees. We evaluate GBFS on several real-world data sets and show that it matches or outperforms other state-of-the-art feature selection algorithms. Yet it scales to larger data set sizes and naturally allows for domain-specific side information.
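
The abstract says only that GBFS is a modification of Gradient Boosted Trees and does not spell out the mechanism. The sketch below is therefore an illustration under a stated assumption, not the authors' implementation: it assumes the modification charges a fixed cost the first time a feature is used for a split, so that boosting prefers to reuse features it has already selected. The names `fit_stump` and `boosted_feature_selection`, the `penalty` value, the use of depth-1 trees with squared loss, and the toy data are all hypothetical.

```python
import itertools


def fit_stump(X, residuals, used, penalty):
    """Find the depth-1 tree (stump) with the lowest penalised squared error.

    Assumption for illustration: a split on a feature that is not yet in
    `used` costs an extra `penalty`; already-selected features are free.
    """
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lv = sum(left) / len(left)
            rv = sum(right) / len(right)
            sse = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
            cost = sse + (0.0 if j in used else penalty)
            if best is None or cost < best[0]:
                best = (cost, j, t, lv, rv)
    return best


def boosted_feature_selection(X, y, rounds=20, learning_rate=0.5, penalty=1.0):
    """Boost stumps on squared loss and return the features that were split on."""
    pred = [0.0] * len(X)
    used = set()
    for _ in range(rounds):
        # For squared loss, the negative gradient is the plain residual.
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(X, residuals, used, penalty)
        if stump is None:  # no feature has two distinct values
            break
        _, j, t, lv, rv = stump
        used.add(j)
        pred = [p + learning_rate * (lv if row[j] <= t else rv)
                for row, p in zip(X, pred)]
    return sorted(used)


# Toy data: all 32 combinations of five binary features.
# Only features 0 and 2 influence the target; 1, 3 and 4 are irrelevant.
X = [list(row) for row in itertools.product([0.0, 1.0], repeat=5)]
y = [2.0 * row[0] + row[2] for row in X]

selected = boosted_feature_selection(X, y)
print(selected)  # prints [0, 2]
```

On this toy design the irrelevant features never reduce the squared error by more than the penalty, so only the two informative features are selected. Raising `penalty` trades predictive fit for a sparser feature set; how GBFS itself sets and structures this trade-off is described in the paper, not here.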
