Concentric mixtures of Mallows models for top-$k$ rankings: sampling and identifiability
This addresses the challenge of aggregating rankings from mixed expert and non-expert groups, which is incremental as it builds on existing Mallows models and Borda algorithms for specific ranking scenarios.
The paper tackles the problem of analyzing heterogeneous populations of voters with mixtures of two Mallows models for top-k rankings, where one group consists of experts and the other of non-experts, by proposing efficient sampling algorithms and showing identifiability and learnability of parameters, with adaptations to recover a consensus ranking consistent with expert rankings.
In this paper, we consider mixtures of two Mallows models for top-$k$ rankings, both with the same location parameter but with different scale parameters, i.e., a mixture of concentric Mallows models. This situation arises when we have a heterogeneous population of voters formed by two homogeneous populations, one of which is a subpopulation of expert voters while the other includes the non-expert voters. We propose efficient sampling algorithms for Mallows top-$k$ rankings. We show the identifiability of both components, and the learnability of their respective parameters in this setting by, first, bounding the sample complexity for the Borda algorithm with top-$k$ rankings and second, proposing polynomial time algorithm for the separation of the rankings in each component. Finally, since the rank aggregation will suffer from a large amount of noise introduced by the non-expert voters, we adapt the Borda algorithm to be able to recover the ground truth consensus ranking which is especially consistent with the expert rankings.