IR · Sep 2, 2020

Comparing Fair Ranking Metrics

arXiv:2009.01311v2 13.432 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses the problem of selecting appropriate fairness metrics for IR practitioners, but it is incremental as it focuses on comparison rather than introducing new metrics.

The paper addresses the lack of comparative analysis of fair ranking metrics in information retrieval, a gap that leaves practitioners with limited guidance on which metrics apply to a given scenario and on how far the metrics agree. It describes existing metrics in a common notation and empirically compares them on multiple data sets covering search and recommendation tasks.

Ranked lists are frequently used by information retrieval (IR) systems to present results believed to be relevant to the user's information need. Fairness is a relatively new but important aspect of these rankings to measure, joining a rich set of metrics that go beyond traditional accuracy or utility constructs to provide a more holistic understanding of IR system behavior. In the last few years, several metrics have been proposed to quantify the (un)fairness of rankings, especially with respect to particular group(s) of content providers, but comparative analyses of these metrics -- particularly for IR -- are lacking. There is limited guidance, therefore, to decide what fairness metrics are applicable to a specific scenario, or assessment of the extent to which metrics agree or disagree when applied to real data. In this paper, we describe several fair ranking metrics from existing literature in a common notation, enabling direct comparison of their assumptions, goals, and design choices; we then empirically compare them on multiple data sets covering both search and recommendation tasks.
