Non asymptotic distributional bounds for the Dickman Approximation of the running time of the Quickselect algorithm
For researchers in algorithm analysis and probability, this gives explicit finite-sample error bounds for the Dickman approximation of Quickselect's running time, with a proven optimal rate for most cases.
The paper provides non-asymptotic distributional bounds for the Dickman approximation of the running time of the Quickselect algorithm, showing a Wasserstein distance bound of order O(log n / n) for the number of comparisons. This recovers and extends previous asymptotic results.
Given a non-negative random variable $W$ and $θ>0$, let the generalized Dickman transformation map the distribution of $W$ to that of $$ W^*=_d U^{1/θ}(W+1), $$ where $U \sim {\cal U}[0,1]$, a uniformly distributed variable on the unit interval, independent of $W$, and where $=_d$ denotes equality in distribution. It is well known that $W^*$ and $W$ are equal in distribution if and only if $W$ has the generalized Dickman distribution ${\cal D}_θ$. We demonstrate that the Wasserstein distance $d_1$ between $W$, a non-negative random variable with finite mean, and $D_θ$ having distribution ${\cal D}_θ$ obeys the inequality $$ d_1(W,D_θ) \le (1+θ)d_1(W,W^*). $$ The specialization of this bound to the case $θ=1$ and coupling constructions yield $$ d_1(W_{n,1},D) \le \frac{8\log (n/2)+10}{n} \quad \mbox{for all $n \ge 1$, where} \quad W_{n,1}=\frac{1}{n}C_{n,1}-1, $$ and $C_{n,m}$ is the number of comparisons made by the Quickselect algorithm to find the $m^{th}$ smallest element of a list of $n$ distinct numbers. A similar bound holds for $m \ge 2$, and together recover the results of [12] that show distributional convergence of $W_n$ to the standard Dickman distribution in the asymptotic regime $m=o(n)$. By developing an exact expression for the expected running time $E[C_{n,m}]$, lower bounds are provided that show the rate is not improvable for all $m \not = 2$.