Controlling Fairness and Bias in Dynamic Learning-to-Rank
This work addresses fairness issues in two-sided online platforms such as news or music services. It offers an approach that balances user utility against provider exposure, building incrementally on existing fairness-aware ranking methods.
The paper tackles the problem of unfairness to item providers in learning-to-rank systems by proposing an algorithm that enforces merit-based fairness guarantees for groups while learning from implicit feedback, achieving practical and robust performance with theoretical convergence guarantees.
Rankings are the primary interface through which many online platforms match users to items (e.g. news, products, music, video). In these two-sided markets, users are not the only ones who draw utility from the rankings; the rankings also determine the utility (e.g. exposure, revenue) for the item providers (e.g. publishers, sellers, artists, studios). It has already been noted that myopically optimizing utility to the users, as done by virtually all learning-to-rank algorithms, can be unfair to the item providers. We therefore present a learning-to-rank approach that explicitly enforces merit-based fairness guarantees for groups of items (e.g. articles by the same publisher, tracks by the same artist). In particular, we propose a learning algorithm that ensures notions of amortized group fairness while simultaneously learning the ranking function from implicit feedback data. The algorithm takes the form of a controller that integrates unbiased estimators for both fairness and utility, dynamically adapting both as more data becomes available. In addition to its rigorous theoretical foundation and convergence guarantees, we find empirically that the algorithm is highly practical and robust.
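To make the controller idea concrete, here is a minimal sketch of how a fairness controller for amortized group exposure might look. It is an illustration under stated assumptions, not the algorithm as specified in the paper: the abstract does not give the control rule, so the proportional error term, the 1 / log2(rank + 1) position-bias model, the gain `lam`, and all function names are hypothetical. The sketch also assumes relevance estimates are fixed and given, whereas the approach described above learns them from implicit feedback.

```python
import numpy as np


def position_exposure(n_items):
    """Assumed position-bias model: exposure at rank r is 1 / log2(r + 1)."""
    ranks = np.arange(1, n_items + 1)
    return 1.0 / np.log2(ranks + 1)


def exposure_per_merit(cum_exposure, merit, groups):
    """Map each group id to its accumulated exposure divided by its total merit."""
    return {
        g: cum_exposure[groups == g].sum() / merit[groups == g].sum()
        for g in np.unique(groups)
    }


def controller_ranking(est_relevance, cum_exposure, groups, lam):
    """Rank by estimated relevance plus a boost for groups that are behind.

    The error of an item is how far its group trails the best-treated group
    in exposure per unit of merit (an assumed proportional-control rule).
    """
    ratio = exposure_per_merit(cum_exposure, est_relevance, groups)
    best = max(ratio.values())
    error = np.array([best - ratio[g] for g in groups])
    scores = est_relevance + lam * error
    return np.argsort(-scores, kind="stable")  # item indices, best first


def simulate(est_relevance, groups, lam, steps):
    """Run the controller and return the per-step exposure/merit disparity."""
    n_items = len(est_relevance)
    exposure = position_exposure(n_items)
    cum_exposure = np.zeros(n_items)
    for _ in range(steps):
        ranking = controller_ranking(est_relevance, cum_exposure, groups, lam)
        cum_exposure[ranking] += exposure  # item at rank r gains exposure[r]
    ratio = exposure_per_merit(cum_exposure, est_relevance, groups)
    return (max(ratio.values()) - min(ratio.values())) / steps


# Hypothetical toy data: four items in two groups with similar merit.
relevance = np.array([0.60, 0.55, 0.50, 0.45])
groups = np.array([0, 0, 1, 1])

unfair = simulate(relevance, groups, lam=0.0, steps=200)  # pure relevance ranking
fair = simulate(relevance, groups, lam=1.0, steps=200)    # controller switched on

print(f"disparity without controller: {unfair:.4f}")
print(f"disparity with controller:    {fair:.4f}")
```

With `lam=0.0` the ranking never changes, so the group with slightly higher merit collects a disproportionate share of exposure at every step. With a positive gain, the accumulated error eventually outweighs the relevance gap and the trailing group is promoted, which keeps the cumulative disparity bounded and drives the amortized (per-step) disparity toward zero in this toy setting.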