On the Sample Complexity of Rank Regression from Pairwise Comparisons
This work addresses the sample complexity for rank regression, which is incremental as it provides theoretical bounds for a specific learning setting.
The paper tackles the problem of learning model parameters for rank regression from pairwise comparisons, showing that with a dataset of N samples in R^d, it suffices to conduct M in Ω(dN log^3 N/ε^2) comparisons to achieve ε accuracy when N is Ω(d/ε^2).
We consider a rank regression setting, in which a dataset of $N$ samples with features in $\mathbb{R}^d$ is ranked by an oracle via $M$ pairwise comparisons. Specifically, there exists a latent total ordering of the samples; when presented with a pair of samples, a noisy oracle identifies the one ranked higher with respect to the underlying total ordering. A learner observes a dataset of such comparisons and wishes to regress sample ranks from their features. We show that to learn the model parameters with $ε> 0$ accuracy, it suffices to conduct $M \in Ω(dN\log^3 N/ε^2)$ comparisons uniformly at random when $N$ is $Ω(d/ε^2)$.