On the identifiability of mixtures of ranking models
The paper resolves a fundamental theoretical question about ranking models, which are standard tools in machine learning for applications such as recommendation systems. The contribution is nonetheless incremental in that it builds on existing algebraic geometry methods.
The paper tackles the open problem of parameter identifiability in mixtures of ranking models, such as Bradley-Terry-Luce and Plackett-Luce, and proves that two-component mixtures of these models are generically identifiable: the parameters can be recovered except on a pathological subset of measure zero.
Mixtures of ranking models are standard tools for ranking problems. However, even the fundamental question of parameter identifiability is not fully understood: the identifiability of a mixture model with two Bradley-Terry-Luce (BTL) components has remained open. In this work, we show that popular mixtures of ranking models with two components (BTL, multinomial logistic models with slates of size 3, or Plackett-Luce) are generically identifiable, i.e., the ground-truth parameters can be identified except when they are from a pathological subset of measure zero. We provide a framework for verifying the number of solutions in a general family of polynomial systems using algebraic geometry, and apply it to these mixtures of ranking models to establish generic identifiability. The framework can be applied more broadly to other learning models and may be of independent interest.
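To make the objects in the abstract concrete, the following is a minimal sketch of a two-component BTL mixture over pairwise comparisons. It assumes the standard BTL form, in which item i beats item j with probability w_i / (w_i + w_j), and a mixing weight `lam` for the first component. The function names, the example weights, and the restriction to pairwise comparison probabilities are illustrative assumptions, not the paper's notation or observation model. The sketch checks two trivial symmetries that leave the comparison probabilities unchanged: swapping the two components together with their mixing weights, and rescaling the weights of one component.

```python
import numpy as np


def btl_pairwise(w):
    """Return matrix P with P[i, j] = w[i] / (w[i] + w[j]) (standard BTL form)."""
    w = np.asarray(w, dtype=float)
    return w[:, None] / (w[:, None] + w[None, :])


def mixture_pairwise(lam, w_a, w_b):
    """Pairwise win probabilities under a two-component BTL mixture.

    lam is the (hypothetical) mixing weight of component A; 1 - lam goes to B.
    """
    return lam * btl_pairwise(w_a) + (1.0 - lam) * btl_pairwise(w_b)


# Illustrative parameters, chosen arbitrarily for this sketch.
w_a = np.array([1.0, 2.0, 4.0])
w_b = np.array([3.0, 1.0, 0.5])
lam = 0.3

P = mixture_pairwise(lam, w_a, w_b)

# Symmetry 1: relabeling the components does not change the distribution.
P_swapped = mixture_pairwise(1.0 - lam, w_b, w_a)

# Symmetry 2: BTL probabilities are invariant to rescaling a component's weights.
P_scaled = mixture_pairwise(lam, 10.0 * w_a, w_b)

print(np.allclose(P, P_swapped))  # relabeling leaves P unchanged
print(np.allclose(P, P_scaled))   # rescaling leaves P unchanged
```

Because of these symmetries, identifiability can only be meant up to relabeling and rescaling, for example after fixing a normalization and an ordering of the components. How the paper fixes them is not stated in the abstract.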