LG CY

Feb 28, 2022

Fast Feature Selection with Fairness Constraints

Francesco Quinzan, Rajiv Khanna, Moshik Hershcovitch, Sarel Cohen, Daniel G. Waddington, Tobias Friedrich, Michael W. Mahoney

arXiv:2202.13718v2

5.85 citations

Originality: Incremental advance

AI Analysis

This work addresses the efficiency and fairness of feature selection for machine learning practitioners. It is an incremental advance that builds on the existing adaptive query model and Orthogonal Matching Pursuit (OMP).

The paper tackles the computationally challenging problem of optimal feature selection on large datasets. It extends the adaptive query model to Orthogonal Matching Pursuit for non-submodular functions, which yields exponentially fast parallel run time in that model. The extension also supports downward-closed constraints, which can encode certain fairness criteria, and it comes with strong approximation guarantees under standard assumptions.

We study the fundamental problem of selecting optimal features for model construction. This problem is computationally challenging on large datasets, even with the use of greedy algorithm variants. To address this challenge, we extend the adaptive query model, recently proposed for the greedy forward selection for submodular functions, to the faster paradigm of Orthogonal Matching Pursuit for non-submodular functions. The proposed algorithm achieves exponentially fast parallel run time in the adaptive query model, scaling much better than prior work. Furthermore, our extension allows the use of downward-closed constraints, which can be used to encode certain fairness criteria into the feature selection process. We prove strong approximation guarantees for the algorithm based on standard assumptions. These guarantees are applicable to many parametric models, including Generalized Linear Models. Finally, we demonstrate empirically that the proposed algorithm competes favorably with state-of-the-art techniques for feature selection, on real-world and synthetic datasets.
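To make the abstract's two main ingredients concrete, here is a minimal sketch of sequential Orthogonal Matching Pursuit for least squares, with a feasibility check that restricts selection to a downward-closed family of feature sets. It is not the paper's algorithm: the adaptive, parallel query scheme is omitted, and the function name `omp_select`, the predicate `is_feasible`, the group labels, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def omp_select(X, y, k, is_feasible=lambda S: True):
    """Pick up to k feature indices by plain (sequential) OMP.

    is_feasible is a hypothetical predicate on a list of indices. It is
    assumed to be downward-closed: every subset of a feasible set is feasible.
    """
    d = X.shape[1]
    selected = []
    residual = y.astype(float).copy()
    for _ in range(k):
        # For squared loss, |X^T r| is the gradient magnitude per feature.
        scores = np.abs(X.T @ residual).astype(float)
        for j in range(d):
            # Exclude chosen features and any that would break the constraint.
            if j in selected or not is_feasible(selected + [j]):
                scores[j] = -np.inf
        best = int(np.argmax(scores))
        if not np.isfinite(scores[best]):
            break  # no feasible feature is left
        selected.append(best)
        # Refit on the selected columns and update the residual.
        coef, *_ = np.linalg.lstsq(X[:, selected], y, rcond=None)
        residual = y - X[:, selected] @ coef
    return selected

# Toy data (assumption): orthonormal columns, signal on features 0 and 3.
rng = np.random.default_rng(0)
X, _ = np.linalg.qr(rng.normal(size=(50, 6)))
y = 3.0 * X[:, 0] - 2.0 * X[:, 3]

# Hypothetical group label per feature; allow at most one feature from group 0.
groups = np.array([0, 1, 1, 0, 2, 2])

def at_most_one_from_group0(S):
    return sum(int(groups[j] == 0) for j in S) <= 1

unconstrained = omp_select(X, y, k=2)
constrained = omp_select(X, y, k=2, is_feasible=at_most_one_from_group0)
```

In this toy setup the unconstrained run recovers the two signal features, while the constrained run keeps feature 0 and must skip feature 3 because both belong to the restricted group. A cap on how many features may come from one group is a simple example of a downward-closed constraint; which constraints correspond to which fairness criteria is a question for the paper itself.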
