On A Mallows-type Model For (Ranked) Choices
The paper provides a practical solution for preference learning from ranked choices over varying display sets, though it is an incremental extension of Mallows-type models.
The paper tackles the problem of modeling population preferences from ranked choice data where participants select ordered lists from varying display sets, introducing a distance-based ranking model using a new Reverse Major Index (RMJ) distance. The result is a model with simple closed-form choice probabilities and effective parameter estimation methods that show strong generalization power on real data, particularly when the variety of display sets is limited.
We consider a preference learning setting where every participant chooses an ordered list of $k$ most preferred items among a displayed set of candidates. (The set can be different for every participant.) We identify a distance-based ranking model for the population's preferences and their (ranked) choice behavior. The ranking model resembles the Mallows model but uses a new distance function called Reverse Major Index (RMJ). We find that despite the need to sum over all permutations, the RMJ-based ranking distribution aggregates into (ranked) choice probabilities with a simple closed-form expression. We develop effective methods to estimate the model parameters and showcase their generalization power using real data, especially when there is a limited variety of display sets.
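To make the "sum over all permutations" concrete, the following is a minimal sketch of how a Mallows-type ranking distribution aggregates into top-$k$ choice probabilities over a display set. It is not the paper's method: the RMJ distance is not defined in this text, so the classical Kendall tau distance is used as a stand-in, and the function names, the dispersion parameter `q`, and the toy items are all assumptions made for illustration. The enumeration is brute force and only feasible for a handful of items, which is the cost a closed-form expression would avoid.

```python
from itertools import permutations


def kendall_tau(perm, center):
    """Count item pairs that perm orders differently from center.

    Stand-in distance (classical Mallows); the RMJ distance is not
    reproduced here because its definition is not given in this text.
    """
    pos = {item: i for i, item in enumerate(perm)}
    ranks = [pos[item] for item in center]
    n = len(ranks)
    return sum(
        1 for i in range(n) for j in range(i + 1, n) if ranks[i] > ranks[j]
    )


def ranking_distribution(center, q, distance=kendall_tau):
    """Mallows-type model: P(perm) is proportional to q ** distance(perm, center)."""
    weights = {p: q ** distance(p, center) for p in permutations(center)}
    z = sum(weights.values())  # normalizing constant over all permutations
    return {p: w / z for p, w in weights.items()}


def ranked_choice_probs(center, q, display_set, k, distance=kendall_tau):
    """Probability of each ordered top-k list chosen from display_set.

    Sums the ranking probabilities of every permutation whose restriction
    to display_set starts with that ordered list (brute force).
    """
    probs = {}
    for perm, p in ranking_distribution(center, q, distance).items():
        shown = [item for item in perm if item in display_set]
        choice = tuple(shown[:k])
        probs[choice] = probs.get(choice, 0.0) + p
    return probs


if __name__ == "__main__":
    # Hypothetical example: central ranking a > b > c, only a and c displayed.
    center = ("a", "b", "c")
    for choice, p in sorted(ranked_choice_probs(center, 0.5, {"a", "c"}, 1).items()):
        print(choice, round(p, 4))
```

Any other distance with the same signature can be passed through the `distance` argument, so the same enumeration could serve as a numerical check on a closed-form expression for small instances.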