A Bayesian Approach to In-Game Win Probability in Soccer
This work addresses a technical problem for soccer analysts and fans by improving the accuracy of in-game win probability models, though the contribution is incremental in that it adapts existing concepts to a specific domain.
The paper tackles the challenge of building accurate in-game win probability models for soccer, a task made difficult by the sport's low-scoring nature. It introduces a Bayesian framework whose win, tie and loss probabilities are shown to be well calibrated on eight seasons of data from the top-five leagues.
In-game win probability models, which provide a sports team's likelihood of winning at each point in a game based on historical observations, are becoming increasingly popular. In baseball, basketball and American football, they have become important tools to enhance fan experience, to evaluate in-game decision-making, and to inform coaching decisions. While equally relevant in soccer, the adoption of these models is held back by technical challenges arising from the low-scoring nature of the sport. In this paper, we introduce an in-game win probability model for soccer that addresses the shortcomings of existing models. First, we demonstrate that in-game win probability models for other sports struggle to provide accurate estimates for soccer, especially towards the end of a game. Second, we introduce a novel Bayesian statistical framework that estimates running win, tie and loss probabilities by leveraging a set of contextual game state features. An empirical evaluation on eight seasons of data for the top-five soccer leagues demonstrates that our framework provides well-calibrated probabilities. Furthermore, two use cases show its ability to enhance fan experience and to evaluate performance in crucial game situations.
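To make the notion of "well-calibrated probabilities" concrete, the following is a minimal sketch of a binned calibration check. It is not the paper's evaluation code: the function names `calibration_table` and `expected_calibration_error`, the equal-width binning, and the toy inputs are all assumptions made for illustration. The idea is that, among all game states where a model predicts an outcome with probability around p, that outcome should occur in roughly a fraction p of cases.

```python
def calibration_table(probs, outcomes, n_bins=5):
    """Bin predictions by probability and compare, per bin, the mean
    predicted probability with the observed outcome frequency.

    probs    -- predicted probabilities for one outcome (e.g. home win)
    outcomes -- 1 if that outcome occurred, else 0
    (Hypothetical helper; equal-width bins are an assumption.)
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, y))

    table = []
    for members in bins:
        if not members:
            continue  # skip empty bins
        mean_pred = sum(p for p, _ in members) / len(members)
        obs_freq = sum(y for _, y in members) / len(members)
        table.append((mean_pred, obs_freq, len(members)))
    return table


def expected_calibration_error(table):
    """Size-weighted average gap between predicted and observed frequency."""
    total = sum(n for _, _, n in table)
    return sum(n * abs(mp - of) for mp, of, n in table) / total


# Toy usage with made-up numbers: one call per outcome (win, tie, loss).
win_probs = [0.1, 0.1, 0.9, 0.9]
win_happened = [0, 0, 1, 1]
table = calibration_table(win_probs, win_happened, n_bins=2)
ece = expected_calibration_error(table)
```

For a three-outcome model such as the one described here, the same check would be run separately on the win, tie and loss probabilities. A value near zero means the predicted probabilities match observed frequencies closely.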