The Randomized Dependence Coefficient
The work offers a practical tool for statisticians and data scientists who need efficient dependence measures, though its contribution is incremental, since it builds on the Hirschfeld-Gebelein-Rényi Maximum Correlation Coefficient.
The paper addresses the problem of measuring non-linear dependence between random variables of arbitrary dimension. It introduces the Randomized Dependence Coefficient (RDC), which is invariant to marginal distribution transformations, has low computational cost, and can be implemented in five lines of R code.
We introduce the Randomized Dependence Coefficient (RDC), a measure of non-linear dependence between random variables of arbitrary dimension based on the Hirschfeld-Gebelein-Rényi Maximum Correlation Coefficient. RDC is defined in terms of correlation of random non-linear copula projections; it is invariant with respect to marginal distribution transformations, has low computational cost and is easy to implement: just five lines of R code, included at the end of the paper.
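The phrase "correlation of random non-linear copula projections" can be read as a three-step recipe: map each marginal to its empirical copula (normalized ranks), pass the result through random non-linear projections, and take the largest canonical correlation between the two projected feature sets. The following is a minimal Python/NumPy sketch of that reading, not the paper's R listing. The sine non-linearity, the Gaussian projection weights, the defaults `k=20` and `s=1/6`, the rank tolerance `tol`, and all function names are assumptions made here for illustration; none of them is specified in the abstract above.

```python
import numpy as np


def copula_transform(x):
    """Replace each column by its normalized ranks (empirical CDF values)."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    # Double argsort yields ordinal ranks 0..n-1; ties are broken arbitrarily.
    ranks = np.argsort(np.argsort(x, axis=0), axis=0) + 1
    return ranks / x.shape[0]


def random_sine_features(u, k, s, rng):
    """Project copula values (plus a bias column) through k random sinusoids."""
    u = np.column_stack([u, np.ones(u.shape[0])])
    w = rng.standard_normal((u.shape[1], k))  # assumed Gaussian weights
    return np.sin((s / u.shape[1]) * (u @ w))


def largest_canonical_correlation(a, b, tol=1e-7):
    """Largest canonical correlation between the column spaces of a and b."""
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)
    # Orthonormal bases of the centered column spaces; numerically negligible
    # directions are dropped so they cannot inflate the correlation.
    ua, sa, _ = np.linalg.svd(a, full_matrices=False)
    ub, sb, _ = np.linalg.svd(b, full_matrices=False)
    ua = ua[:, sa > tol * sa[0]]
    ub = ub[:, sb > tol * sb[0]]
    # Singular values of ua^T ub are the canonical correlations.
    rho = np.linalg.svd(ua.T @ ub, compute_uv=False)
    return float(min(rho[0], 1.0))


def rdc(x, y, k=20, s=1.0 / 6.0, seed=0):
    """Sketch of an RDC-style dependence score in [0, 1]."""
    rng = np.random.default_rng(seed)
    fx = random_sine_features(copula_transform(x), k, s, rng)
    fy = random_sine_features(copula_transform(y), k, s, rng)
    return largest_canonical_correlation(fx, fy)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.uniform(-1.0, 1.0, 1000)
    print("quadratic dependence:", rdc(x, x ** 2))
    print("independent samples: ", rdc(x, rng.uniform(-1.0, 1.0, 1000)))
```

Because only the ranks of each marginal enter the computation, applying a strictly increasing transformation to `x` (for example `np.exp(x)`) leaves the score unchanged for a fixed `seed`, which is the invariance property the abstract describes. The cost is dominated by sorting and by singular value decompositions of two n-by-k matrices, so it stays low as long as `k` is small.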