LG · AI · ST · ML · Sep 1, 2021

Nonasymptotic one-and two-sample tests in high dimension with unknown covariance structure

arXiv:2109.01730v2 · 3 citations
AI Analysis

This paper addresses statistical inference in high-dimensional data analysis and provides theoretical guarantees. The contribution is incremental in that it builds on prior work such as Baraud (2002).

The paper studies nonasymptotic hypothesis testing of one- and two-sample mean closeness in high dimension with unknown covariance. It derives upper and lower bounds on the minimal separation distance needed to control both the Type I and Type II errors; for η=0, that distance is Θ(d_*^{1/4}√(‖Σ‖_∞/n)).

Let $\mathbf{X} = (X_i)_{1\leq i \leq n}$ be an i.i.d. sample of square-integrable variables in $\mathbb{R}^d$, with common expectation $μ$ and covariance matrix $Σ$, both unknown. We consider the problem of testing if $μ$ is $η$-close to zero, i.e. $\|μ\| \leq η$ against $\|μ\| \geq (η+ δ)$; we also tackle the more general two-sample mean closeness (also known as {\em relevant difference}) testing problem. The aim of this paper is to obtain nonasymptotic upper and lower bounds on the minimal separation distance $δ$ such that we can control both the Type I and Type II errors at a given level. The main technical tools are concentration inequalities, first for a suitable estimator of $\|μ\|^2$ used as a test statistic, and secondly for estimating the operator and Frobenius norms of $Σ$ coming into the quantiles of said test statistic. These properties are obtained for Gaussian and bounded distributions. Particular attention is given to the dependence on the pseudo-dimension $d_*$ of the distribution, defined as $d_* := \|Σ\|_2^2/\|Σ\|_\infty^2$. In particular, for $η=0$, the minimum separation distance is $Θ( d_*^{\frac{1}{4}}\sqrt{\|Σ\|_\infty/n})$, in contrast with the minimax estimation distance for $μ$, which is $Θ(d_e^{\frac{1}{2}}\sqrt{\|Σ\|_\infty/n})$ (where $d_e:=\|Σ\|_1/\|Σ\|_\infty$). This generalizes a phenomenon spelled out in particular by Baraud (2002).
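To make the gap between the two rates concrete, the following minimal sketch computes $d_*$, $d_e$ and the two rate scalings from the eigenvalues of $Σ$. It assumes the norms are Schatten norms of a symmetric positive semidefinite matrix (operator norm = largest eigenvalue, squared Frobenius norm = sum of squared eigenvalues, $\|Σ\|_1$ = trace), which matches the abstract's mention of operator and Frobenius norms but is otherwise an assumption. The function names are hypothetical, and the constants hidden in $Θ(\cdot)$ are set to 1, so the outputs indicate scaling only.

```python
import math


def effective_dimensions(eigenvalues):
    """Return (d_star, d_e) computed from the eigenvalues of Sigma.

    Assumes Schatten norms of a symmetric PSD matrix:
      ||Sigma||_inf = largest eigenvalue (operator norm),
      ||Sigma||_2^2 = sum of squared eigenvalues (squared Frobenius norm),
      ||Sigma||_1   = sum of eigenvalues (trace).
    """
    lam = [float(v) for v in eigenvalues]
    if not lam or min(lam) < 0.0 or max(lam) == 0.0:
        raise ValueError("need a nonempty, nonnegative, nonzero spectrum")
    op_norm = max(lam)
    d_star = sum(v * v for v in lam) / op_norm ** 2
    d_e = sum(lam) / op_norm
    return d_star, d_e


def rate_scalings(eigenvalues, n):
    """Return (testing, estimation) rates with the Theta-constants set to 1."""
    d_star, d_e = effective_dimensions(eigenvalues)
    scale = math.sqrt(max(eigenvalues) / n)
    return d_star ** 0.25 * scale, d_e ** 0.5 * scale


# Hypothetical example: isotropic covariance in d = 10_000 with n = 100 samples.
spectrum = [1.0] * 10_000
testing, estimation = rate_scalings(spectrum, n=100)
print(f"testing ~ {testing:.3f}, estimation ~ {estimation:.3f}")
```

In this isotropic example $d_* = d_e = d$, so the testing rate scales as $d^{1/4}/\sqrt{n}$ and the estimation rate as $d^{1/2}/\sqrt{n}$: detecting that $μ$ is nonzero is possible at a smaller distance than estimating $μ$ itself.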

