SI | IR | SOC-PH | Dec 11, 2020

Limits of PageRank-based ranking methods in sports data

arXiv:2012.06366v1 | 3 citations
Originality: Incremental advance
AI Analysis

This work addresses the problem of effectively ranking sports teams for sports analysts and enthusiasts, demonstrating that PageRank's utility is limited and often inferior to simpler methods.

This paper investigates the effectiveness of PageRank for ranking sports teams, finding that it only outperforms simpler methods like ranking by number of wins when a small fraction of games have been played. Increased data randomness further diminishes PageRank's advantage. The authors propose a new PageRank variant that performs better but shares the same sensitivity to randomness.

While PageRank has been used extensively to rank sports tournament participants (teams or individuals), its superiority over simpler ranking methods has never been clearly demonstrated. We use sports results from 18 major leagues to calibrate a state-of-the-art model for synthetic sports results. Model data are then used to assess the ranking performance of PageRank in a controlled setting. We find that PageRank outperforms the benchmark ranking by the number of wins only when a small fraction of all games have been played. Increased randomness in the data, such as intrinsic randomness of outcomes or home-team advantage, further narrows the range in which PageRank is superior. We propose a new PageRank variant which outperforms PageRank in all evaluated settings, yet shares its sensitivity to increased randomness in the data. Our main findings are confirmed by evaluating the ranking algorithms on real data. Our work demonstrates the danger of using novel metrics and algorithms without considering their limits of applicability.
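To make the comparison concrete, here is a minimal sketch of the two rankings the abstract contrasts: ranking by number of wins, and standard PageRank on a game-results network. Several details are assumptions, not taken from the paper: each game adds a directed link from the loser to the winner, the damping factor is 0.85, undefeated teams spread their score uniformly, and the function names and toy results are purely illustrative. The authors' proposed PageRank variant is not reproduced here.

```python
def rank_by_wins(games):
    """Rank teams by number of wins; games is a list of (winner, loser) pairs."""
    wins = {}
    for winner, loser in games:
        wins[winner] = wins.get(winner, 0) + 1
        wins.setdefault(loser, 0)  # keep winless teams in the ranking
    # Ties are broken alphabetically (an arbitrary choice for this sketch).
    return sorted(wins, key=lambda team: (-wins[team], team))


def pagerank_scores(games, damping=0.85, iterations=100):
    """Standard PageRank by power iteration on a loser -> winner network."""
    teams = sorted({team for game in games for team in game})
    n = len(teams)
    # out_links[loser] holds one entry per lost game, pointing at the winner.
    out_links = {team: [] for team in teams}
    for winner, loser in games:
        out_links[loser].append(winner)

    score = {team: 1.0 / n for team in teams}
    for _ in range(iterations):
        # Undefeated teams have no outgoing links; spread their score evenly.
        dangling = sum(score[t] for t in teams if not out_links[t])
        new_score = {t: (1.0 - damping) / n + damping * dangling / n for t in teams}
        for loser, winners in out_links.items():
            if not winners:
                continue
            share = damping * score[loser] / len(winners)
            for winner in winners:
                new_score[winner] += share
        score = new_score
    return score


def rank_by_pagerank(games, **kwargs):
    scores = pagerank_scores(games, **kwargs)
    return sorted(scores, key=lambda team: (-scores[team], team))


# Hypothetical toy results, each written as (winner, loser).
games = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("B", "D"), ("D", "A")]
print(rank_by_wins(games))
print(rank_by_pagerank(games))
```

The wins ranking looks only at each team's own record, whereas PageRank also weighs whom a team has beaten. With few games played that extra information can help; the abstract's finding is that this benefit shrinks as more games are played or as outcomes become more random.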
