Optimal rates for independence testing via $U$-statistic permutation tests
This work addresses the problem of testing whether two variables are independent and establishes optimal separation rates for a practical permutation test, though its contribution is incremental: it refines existing methods under specific smoothness conditions.
The paper first shows that no valid test of independence is uniformly consistent against alternatives defined only by a lower bound on the dependence measure. It then proposes a permutation test based on a basis expansion and a $U$-statistic estimator, which attains minimax optimal separation rates in many instances under Sobolev-type smoothness constraints. The method is implemented in the R package USP.
We study the problem of independence testing given independent and identically distributed pairs taking values in a $\sigma$-finite, separable measure space. Defining a natural measure of dependence $D(f)$ as the squared $L^2$-distance between a joint density $f$ and the product of its marginals, we first show that there is no valid test of independence that is uniformly consistent against alternatives of the form $\{f: D(f) \geq \rho^2 \}$. We therefore restrict attention to alternatives that impose additional Sobolev-type smoothness constraints, and define a permutation test based on a basis expansion and a $U$-statistic estimator of $D(f)$ that we prove is minimax optimal in terms of its separation rates in many instances. Finally, for the case of a Fourier basis on $[0,1]^2$, we provide an approximation to the power function that offers several additional insights. Our methodology is implemented in the R package USP.
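
To make the permutation step concrete, the following is a minimal sketch for the simplest setting of discrete data taking finitely many values, where $D(f)$ reduces to $\sum_{i,j} (p_{ij} - p_{i\cdot} p_{\cdot j})^2$. It is an illustration under stated assumptions, not the paper's procedure or the interface of the USP package: for brevity it substitutes a plug-in estimate of $D(f)$ for the $U$-statistic estimator, and the function names `plugin_dependence` and `permutation_pvalue`, the number of permutations, and the seed are all hypothetical choices.

```python
import numpy as np


def plugin_dependence(x, y, n_x, n_y):
    """Plug-in estimate of sum_{i,j} (p_ij - p_i. * p_.j)^2.

    Assumption: x and y are integer-coded with values in
    {0, ..., n_x - 1} and {0, ..., n_y - 1}. This is a simple plug-in
    estimate, NOT the U-statistic estimator studied in the paper.
    """
    x = np.asarray(x)
    y = np.asarray(y)
    joint = np.zeros((n_x, n_y))
    np.add.at(joint, (x, y), 1.0)      # contingency table of counts
    joint /= len(x)                    # empirical joint probabilities
    p_x = joint.sum(axis=1)            # empirical marginal of x
    p_y = joint.sum(axis=0)            # empirical marginal of y
    return float(np.sum((joint - np.outer(p_x, p_y)) ** 2))


def permutation_pvalue(x, y, n_x, n_y, n_perm=199, seed=0):
    """Permutation p-value for the null hypothesis of independence.

    Permuting y breaks any dependence on x while preserving both
    marginals, so the permuted statistics mimic the null distribution.
    """
    rng = np.random.default_rng(seed)
    observed = plugin_dependence(x, y, n_x, n_y)
    exceed = 0
    for _ in range(n_perm):
        y_perm = rng.permutation(y)
        if plugin_dependence(x, y_perm, n_x, n_y) >= observed:
            exceed += 1
    # Counting the observed statistic itself keeps the test valid
    # for any finite number of permutations.
    return (1 + exceed) / (1 + n_perm)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.integers(0, 3, size=200)
    y_dep = x.copy()                       # perfectly dependent on x
    y_ind = rng.integers(0, 3, size=200)   # drawn independently of x
    print("dependent:  ", permutation_pvalue(x, y_dep, 3, 3))
    print("independent:", permutation_pvalue(x, y_ind, 3, 3))
```

Rejecting when the returned p-value is at most $\alpha$ gives a test of level $\alpha$ whatever the marginal distributions are, because the pairs are exchangeable under the null. This is the property that makes permutation calibration attractive here; the choice of statistic then governs power rather than validity.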