Kernel Feature Selection via Conditional Covariance Minimization
This paper addresses feature selection for data analysis and presents an incremental improvement over existing kernel dimension reduction methods.
The paper proposes a kernel-based feature selection method that finds a maximally predictive subset of covariates by solving a constrained optimization problem over the trace of the conditional covariance operator. It reports favorable performance against state-of-the-art algorithms on synthetic and real data sets.
We propose a method for feature selection that employs kernel-based measures of independence to find a subset of covariates that is maximally predictive of the response. Building on past work in kernel dimension reduction, we show how to perform feature selection via a constrained optimization problem involving the trace of the conditional covariance operator. We prove various consistency results for this procedure, and also demonstrate that our method compares favorably with other state-of-the-art algorithms on a variety of synthetic and real data sets.
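To make the objective concrete, the following is a minimal sketch of an empirical trace criterion of the kind described above. It is written under several assumptions that are not stated in the abstract: a Gaussian kernel on the covariates, a linear kernel on the response, a fixed regularization parameter eps, and a weight vector w that rescales each feature (a binary w selects a subset). The function names gaussian_gram, center, and ccm_objective are hypothetical and chosen for illustration only. The sketch evaluates the criterion for a given w and does not implement the constrained optimization itself.

```python
import numpy as np


def gaussian_gram(X, sigma=1.0):
    """Gaussian (RBF) kernel matrix for the rows of X (assumed kernel choice)."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma ** 2))


def center(K):
    """Double-center a kernel matrix: H K H with H = I - (1/n) 1 1^T."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H


def ccm_objective(X, y, w, eps=1e-3, sigma=1.0):
    """Empirical trace criterion Tr[G_Y (G_X + n*eps*I)^{-1}].

    G_X is the centered kernel matrix of the reweighted covariates X * w,
    and G_Y is the centered linear-kernel matrix of the response.
    Smaller values indicate that the weighted features explain y better.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    n = X.shape[0]
    G_x = center(gaussian_gram(X * w, sigma))
    G_y = center(np.outer(y, y))
    A = G_x + n * eps * np.eye(n)
    # Tr[G_y A^{-1}] equals Tr[A^{-1} G_y]; solve avoids an explicit inverse.
    return float(np.trace(np.linalg.solve(A, G_y)))


# Toy data: the response depends only on the first of three features.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X[:, 0] + 0.1 * rng.normal(size=100)

relevant = ccm_objective(X, y, np.array([1.0, 0.0, 0.0]))
irrelevant = ccm_objective(X, y, np.array([0.0, 1.0, 0.0]))
```

On toy data of this kind, one would expect the criterion to be smaller when w selects the feature that actually drives the response. A feature selection procedure in this spirit would search over w subject to a constraint on the number of selected features, which this sketch leaves out.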