Measuring Statistical Dependencies via Maximum Norm and Characteristic Functions
This work gives machine learning practitioners a measure of statistical dependence that can be used for tasks such as feature extraction and regularization. The contribution is incremental in that it builds on existing characteristic-function approaches.
The authors address statistical dependence estimation by proposing a new measure based on the maximum norm of the difference between the joint and product-marginal characteristic functions. The measure can detect arbitrary dependencies and can be integrated into machine learning pipelines. In simulations it measures dependencies in high-dimensional, non-linear data and is less affected by the curse of dimensionality than earlier characteristic-function methods; on real data it yields statistically significant improvements in supervised feature extraction and neural network regularization.
In this paper, we focus on the problem of statistical dependence estimation using characteristic functions. We propose a statistical dependence measure based on the maximum norm of the difference between the joint and product-marginal characteristic functions. The proposed measure can detect arbitrary statistical dependence between two random vectors of possibly different dimensions, is differentiable, and is easy to integrate into modern machine learning and deep learning pipelines. We also conduct experiments with both simulated and real data. Our simulations show that the proposed method can measure statistical dependencies in high-dimensional, non-linear data and is less affected by the curse of dimensionality than previous work in this line of research. The experiments with real data demonstrate the potential applicability of our measure in two empirical inference scenarios, showing statistically significant performance improvements when it is applied to supervised feature extraction and deep neural network regularization. The accompanying open-source repository is available at https://bit.ly/3d4ch5I.
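To make the idea concrete, the following is a minimal sketch of an empirical version of such a measure, not the authors' estimator. It replaces the characteristic functions with sample averages and approximates the maximum over frequency pairs by a crude random search. The function name `max_norm_dependence`, the standardization step, the Gaussian sampling of candidate frequencies, and the parameters `n_candidates` and `scale` are all assumptions made for illustration; the paper describes the measure as differentiable, so a gradient-based search over frequencies would be a natural alternative.

```python
import numpy as np


def max_norm_dependence(x, y, n_candidates=512, scale=1.0, seed=0):
    """Approximate max over (a, b) of |ECF_joint(a, b) - ECF_x(a) * ECF_y(b)|.

    x: array of shape (n, dx), y: array of shape (n, dy), paired by row.
    The maximum is approximated by random search over Gaussian frequency
    vectors (an illustrative assumption, not the authors' procedure).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape[0] != y.shape[0]:
        raise ValueError("x and y must contain the same number of samples")

    # Standardize each coordinate so one frequency scale suits both inputs
    # (assumption made for this sketch).
    x = (x - x.mean(axis=0)) / (x.std(axis=0) + 1e-12)
    y = (y - y.mean(axis=0)) / (y.std(axis=0) + 1e-12)

    rng = np.random.default_rng(seed)
    a = rng.normal(scale=scale, size=(n_candidates, x.shape[1]))
    b = rng.normal(scale=scale, size=(n_candidates, y.shape[1]))

    # Column j holds exp(i * <a_j, x_k>) for every sample k.
    phase_x = np.exp(1j * (x @ a.T))
    phase_y = np.exp(1j * (y @ b.T))

    # Empirical joint characteristic function and product of marginals,
    # evaluated at each candidate frequency pair (a_j, b_j).
    joint = (phase_x * phase_y).mean(axis=0)
    product = phase_x.mean(axis=0) * phase_y.mean(axis=0)

    # Maximum modulus of the difference over the candidate pairs.
    return float(np.abs(joint - product).max())
```

Called on paired samples, the function returns a value between 0 and 2. For independent inputs the value stays small but not exactly zero, because of sampling noise and the maximization over candidates; for a non-linear relationship such as y = x² plus noise it should be clearly larger.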