Supervised Dimensionality Reduction via Distance Correlation Maximization
This work addresses the problem of improving feature representation for predictive modeling, offering a method that is distribution-free and model-agnostic, though it appears incremental as it builds on existing dependency criteria and optimization techniques.
The paper tackles supervised dimensionality reduction by proposing a novel formulation based on maximizing distance correlation between low-dimensional features and both responses and covariates, achieving superior empirical results over state-of-the-art methods on multiple datasets.
We propose a novel formulation for supervised dimensionality reduction based on a nonlinear dependency criterion called statistical distance correlation (Székely et al., 2007). The objective is free of distributional assumptions on the regression variables and of assumptions on the regression model. Our formulation learns a low-dimensional feature representation $\mathbf{z}$ that maximizes the squared sum of distance correlations between the low-dimensional features $\mathbf{z}$ and the response $y$, and between $\mathbf{z}$ and the covariates $\mathbf{x}$. We propose a novel algorithm to optimize this objective using the Generalized Minimization Maximization method of Parizi et al. (2015). We report superior empirical results on multiple datasets, demonstrating the effectiveness of our approach over several relevant state-of-the-art supervised dimensionality reduction methods.
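To make the dependency criterion concrete, the following is a minimal sketch of the squared sample distance correlation of Székely et al. (2007), computed from double-centered pairwise Euclidean distance matrices. The function names (`distance_correlation_sq`, `objective`) are hypothetical and not taken from the paper, and `objective` assumes one possible reading of the criterion, namely the sum of the squared distance correlations between $\mathbf{z}$ and $\mathbf{x}$ and between $\mathbf{z}$ and $y$. The sketch only evaluates the criterion for a given $\mathbf{z}$; it does not implement the optimization algorithm.

```python
import numpy as np


def _centered_distances(a):
    """Double-centered pairwise Euclidean distance matrix of the rows of `a`."""
    a = np.asarray(a, dtype=float)
    if a.ndim == 1:
        a = a[:, None]  # treat a vector as n samples of one variable
    diff = a[:, None, :] - a[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    # Subtract row and column means, then add back the grand mean.
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()


def distance_correlation_sq(a, b):
    """Squared sample distance correlation between paired samples `a` and `b`."""
    A = _centered_distances(a)
    B = _centered_distances(b)
    dcov2 = (A * B).mean()      # squared sample distance covariance
    dvar2_a = (A * A).mean()    # squared sample distance variance of a
    dvar2_b = (B * B).mean()    # squared sample distance variance of b
    denom = np.sqrt(dvar2_a * dvar2_b)
    # By convention the correlation is 0 when either sample is constant.
    return 0.0 if denom == 0 else float(dcov2 / denom)


def objective(z, x, y):
    """Assumed form of the criterion: dCor^2(z, x) + dCor^2(z, y)."""
    return distance_correlation_sq(z, x) + distance_correlation_sq(z, y)
```

Under this assumed form, a candidate representation `z` of shape `(n, d)` with `d` smaller than the number of covariates can be scored with `objective(z, x, y)`, where larger values indicate stronger dependence on both the covariates and the response.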