A Differentiable Rank-Based Objective For Better Feature Learning
This work gives machine learning practitioners a more flexible, trainable method for feature learning, though the contribution is incremental because it builds on an existing statistical technique.
The paper tackles the problem of improving feature learning by introducing difFOCI, a differentiable approximation of the FOCI variable selection method, which broadens the method's applicability and improves performance on machine learning tasks. The results show better management of spurious correlations and demonstrate utility in variable selection, neural network regularization, and fairness applications, with evaluations ranging from toy examples to convolutional networks.
In this paper, we leverage existing statistical methods to better understand feature learning from data. We do so by modifying the model-free variable selection method Feature Ordering by Conditional Independence (FOCI), introduced in \cite{azadkia2021simple}. While FOCI is based on a non-parametric coefficient of conditional dependence, we introduce its parametric, differentiable approximation. With this approximate coefficient, we present a new algorithm called difFOCI, which is applicable to a wider range of machine learning problems thanks to its differentiable nature and learnable parameters. We present difFOCI in three contexts: (1) as a variable selection method with baseline comparisons to FOCI, (2) as a trainable model parametrized with a neural network, and (3) as a generic, widely applicable neural network regularizer, one that improves feature learning with better management of spurious correlations. We evaluate difFOCI on increasingly complex problems, ranging from basic variable selection in toy examples to saliency map comparisons in convolutional networks. We then show how difFOCI can be incorporated in the context of fairness to facilitate classification without relying on sensitive data.
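To make the idea of a differentiable approximation concrete, the following is a minimal sketch, not the authors' implementation. It takes the unconditional rank-based coefficient $T_n(Y, Z) = \sum_i (n \min\{R_i, R_{M(i)}\} - L_i^2) / \sum_i L_i (n - L_i)$ from \cite{azadkia2021simple}, where $M(i)$ indexes the nearest neighbour of $Z_i$, and replaces the hard nearest-neighbour lookup with softmax weights over negative squared distances. The softmax weighting, the temperature `beta`, and the function name `soft_rank_dependence` are assumptions made for illustration. NumPy only evaluates the forward pass; the same expression written in an autodiff framework would be differentiable with respect to `z`.

```python
import numpy as np

def soft_rank_dependence(y, z, beta):
    """Soft-nearest-neighbour variant of the unconditional coefficient T_n(Y, Z).

    Hypothetical helper for illustration: `beta` > 0 is an assumed temperature;
    larger values concentrate the weights on the nearest neighbour of each z_i.
    """
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=float).reshape(len(y), -1)
    n = len(y)

    # R_i = #{j : y_j <= y_i} and L_i = #{j : y_j >= y_i}
    R = (y[None, :] <= y[:, None]).sum(axis=1).astype(float)
    L = (y[None, :] >= y[:, None]).sum(axis=1).astype(float)

    # Pairwise squared Euclidean distances between the rows of z
    d2 = ((z[:, None, :] - z[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)  # a point is never its own neighbour

    # Row-wise softmax of -beta * d2 (shifted by the row maximum for stability)
    logits = -beta * d2
    logits -= logits.max(axis=1, keepdims=True)
    w = np.exp(logits)
    w /= w.sum(axis=1, keepdims=True)

    # Weighted average of min(R_i, R_j) stands in for min(R_i, R_{M(i)})
    min_ranks = np.minimum(R[:, None], R[None, :])
    soft_min = (w * min_ranks).sum(axis=1)

    numerator = (n * soft_min - L ** 2).sum()
    denominator = (L * (n - L)).sum()
    return numerator / denominator
```

Under these assumptions, calling `soft_rank_dependence(z ** 2, z, beta=1000.0)` on a sample `z` drawn uniformly from [-1, 1] should give a noticeably larger value than calling it with `y` drawn independently of `z`. A suitable `beta` depends on the scale of `z`, since it multiplies squared distances.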