Non-asymptotic bounds for percentiles of independent non-identical random variables
The result offers theoretical insight for statisticians and data scientists working with heterogeneous data, though the contribution is incremental: it extends known results to the non-identical case.
The paper derives non-asymptotic bounds for percentiles of independent, non-identical random variables and identifies, for a class of distributions, a connection between the median and the harmonic mean of the standard deviations. For Gaussian variables the median is bounded as $O_P\big(n^{1/2}\cdot\big(\sum_{k=1}^n\sigma_k^{-1}\big)^{-1}\big)$.
This note displays an interesting phenomenon for percentiles of independent but non-identical random variables. Let $X_1,\ldots,X_n$ be independent random variables obeying non-identical continuous distributions, and let $X^{(1)}\geq \cdots\geq X^{(n)}$ be the corresponding order statistics. For any $p\in(0,1)$, we investigate the $100(1-p)$-th percentile $X^{(pn)}$ and prove non-asymptotic bounds for $X^{(pn)}$. In particular, for a wide class of distributions, we discover an intriguing connection between the sample median and the harmonic mean of the associated standard deviations. For example, if $X_k\sim\mathcal{N}(0,\sigma_k^2)$ for $k=1,\ldots,n$ and $p=\frac{1}{2}$, we show that the sample median satisfies $\big|{\rm Med}\big(X_1,\ldots,X_n\big)\big|= O_P\Big(n^{1/2}\cdot\big(\sum_{k=1}^n\sigma_k^{-1}\big)^{-1}\Big)$, as long as $\{\sigma_k\}_{k=1}^n$ satisfy a certain mild non-dispersion property.
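The Gaussian example can be checked numerically. The sketch below is an illustrative Monte Carlo experiment, not part of the note: it draws $X_k\sim\mathcal{N}(0,\sigma_k^2)$, computes the sample median, and compares its magnitude with the scale $n^{1/2}\cdot\big(\sum_{k}\sigma_k^{-1}\big)^{-1}$, which equals $n^{-1/2}$ times the harmonic mean of the $\sigma_k$. The function names, the uniform choice of $\sigma_k$ on $[0.5, 5]$, and the trial counts are assumptions made for illustration; whether that choice of $\sigma_k$ meets the note's non-dispersion property is also assumed rather than verified.

```python
import math
import random
import statistics


def harmonic_scale(sigmas):
    """Return n^{1/2} * (sum_k 1/sigma_k)^{-1}, the scale in the stated bound."""
    n = len(sigmas)
    return math.sqrt(n) / sum(1.0 / s for s in sigmas)


def sample_median(sigmas, rng):
    """Draw X_k ~ N(0, sigma_k^2) independently and return the sample median."""
    return statistics.median(rng.gauss(0.0, s) for s in sigmas)


def median_ratio_quantile(sigmas, trials=2000, q=0.95, seed=0):
    """Empirical q-quantile of |median| / harmonic_scale over repeated draws."""
    rng = random.Random(seed)
    scale = harmonic_scale(sigmas)
    ratios = sorted(
        abs(sample_median(sigmas, rng)) / scale for _ in range(trials)
    )
    return ratios[int(q * (trials - 1))]


if __name__ == "__main__":
    rng = random.Random(1)
    for n in (100, 400, 1600):
        # Hypothetical heterogeneous standard deviations (an assumption).
        sigmas = [rng.uniform(0.5, 5.0) for _ in range(n)]
        scale = harmonic_scale(sigmas)
        ratio = median_ratio_quantile(sigmas, trials=300)
        print(f"n={n:5d}  scale={scale:.4f}  95% quantile of ratio={ratio:.3f}")
```

If the bound holds for the chosen $\sigma_k$, the printed ratio should stay roughly constant as $n$ grows while the scale itself shrinks, which is what an $O_P$ statement predicts.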