LGDec 22, 2023
Learning Rich RankingsArjun Seshadri, Stephen Ragain, Johan Ugander
Although the foundations of ranking are well established, the ranking literature has primarily been focused on simple, unimodal models, e.g. the Mallows and Plackett-Luce models, that define distributions centered around a single total ordering. Explicit mixture models have provided some tools for modelling multimodal ranking data, though learning such models from data is often difficult. In this work, we contribute a contextual repeated selection (CRS) model that leverages recent advances in choice modeling to bring a natural multimodality and richness to the rankings space. We provide rigorous theoretical guarantees for maximum likelihood estimation under the model through structure-dependent tail risk and expected risk bounds. As a by-product, we also furnish the first tight bounds on the expected risk of maximum likelihood estimators for the multinomial logit (MNL) choice model and the Plackett-Luce (PL) ranking model, as well as the first tail risk bound on the PL ranking model. The CRS model significantly outperforms existing methods for modeling real world ranking data in a variety of settings, from racing to rank choice voting.
LGSep 13, 2018
Choosing to RankStephen Ragain, Johan Ugander
Ranking data arises in a wide variety of application areas but remains difficult to model, learn from, and predict. Datasets often exhibit multimodality, intransitivity, or incomplete rankings---particularly when generated by humans---yet popular probabilistic models are often too rigid to capture such complexities. In this work we leverage recent progress on similar challenges in discrete choice modeling to form flexible and tractable choice-based models for ranking data. We study choice representations, maps from rankings (complete or top-$k$) to collections of choices, as a way of forming ranking models from choice models. We focus on the repeated selection (RS) choice representation, first used to form the Plackett-Luce ranking model from the conditional multinomial logit choice model. We fully characterize, for a prime number of alternatives, the choice representations that admit ranking distributions with unit normalization, a desirably property that greatly simplifies maximum likelihood estimation. We further show that only specific minor variations on repeated selection exhibit this property. Our choice-based ranking models provide higher out-of-sample likelihood when compared to Plackett-Luce and Mallows models on a broad collection of ranking tasks including food preferences, ranked-choice elections, car racing, and search engine relevance tasks.
MLJul 24, 2018
Improving pairwise comparison models using Empirical Bayes shrinkageStephen Ragain, Alexander Peysakhovich, Johan Ugander
Comparison data arises in many important contexts, e.g. shopping, web clicks, or sports competitions. Typically we are given a dataset of comparisons and wish to train a model to make predictions about the outcome of unseen comparisons. In many cases available datasets have relatively few comparisons (e.g. there are only so many NFL games per year) or efficiency is important (e.g. we want to quickly estimate the relative appeal of a product). In such settings it is well known that shrinkage estimators outperform maximum likelihood estimators. A complicating matter is that standard comparison models such as the conditional multinomial logit model are only models of conditional outcomes (who wins) and not of comparisons themselves (who competes). As such, different models of the comparison process lead to different shrinkage estimators. In this work we derive a collection of methods for estimating the pairwise uncertainty of pairwise predictions based on different assumptions about the comparison process. These uncertainty estimates allow us both to examine model uncertainty as well as perform Empirical Bayes shrinkage estimation of the model parameters. We demonstrate that our shrunk estimators outperform standard maximum likelihood methods on real comparison data from online comparison surveys as well as from several sports contexts.
MLMar 8, 2016
Pairwise Choice Markov ChainsStephen Ragain, Johan Ugander
As datasets capturing human choices grow in richness and scale -- particularly in online domains -- there is an increasing need for choice models that escape traditional choice-theoretic axioms such as regularity, stochastic transitivity, and Luce's choice axiom. In this work we introduce the Pairwise Choice Markov Chain (PCMC) model of discrete choice, an inferentially tractable model that does not assume any of the above axioms while still satisfying the foundational axiom of uniform expansion, a considerably weaker assumption than Luce's choice axiom. We show that the PCMC model significantly outperforms the Multinomial Logit (MNL) model in prediction tasks on both synthetic and empirical datasets known to exhibit violations of Luce's axiom. Our analysis also synthesizes several recent observations connecting the Multinomial Logit model and Markov chains; the PCMC model retains the Multinomial Logit model as a special case.