LGDec 22, 2022
Stochastic analysis of the Elo rating algorithm in round-robin tournamentsDaniel Gomes de Pinho Zanco, Leszek Szczecinski, Eduardo Vinicius Kuhn et al.
The Elo algorithm, renowned for its simplicity, is widely used for rating in sports tournaments and other applications. However, despite its widespread use, a detailed understanding of the convergence characteristics of the Elo algorithm is still lacking. Aiming to fill this gap, this paper presents a comprehensive (stochastic) analysis of the Elo algorithm, considering round-robin tournaments. Specifically, analytical expressions are derived describing the evolution of the skills and performance metrics. Then, taking into account the relationship between the behavior of the algorithm and the step-size value, which is a hyperparameter that can be controlled, design guidelines and discussions about the performance of the algorithm are provided. Experimental results are shown confirming the accuracy of the analysis and illustrating the applicability of the theoretical findings using real-world data obtained from SuperLega, the Italian volleyball league.
LGAug 8, 2022
Rankability and Linear Ordering Problem: New Probabilistic Insight and AlgorithmsLeszek Szczecinski, Harsh Sukheja
The linear ordering problem (LOP), which consists in ordering M objects from their pairwise comparisons, is commonly applied in many areas of research. While efforts have been made to devise efficient LOP algorithms, verification of whether the data are rankable, that is, if the linear ordering problem (LOP) solutions have a meaningful interpretation, received much less attention. To address this problem, we adopt a probabilistic perspective where the results of pairwise comparisons are modeled as Bernoulli variables with a common parameter and we estimate the latter from the observed data. The brute-force approach to the required enumeration has a prohibitive complexity of O(M !), so we reformulate the problem and introduce a concept of the Slater spectrum that generalizes the Slater index, and then devise an algorithm to find the spectrum with complexity O(M^3 2^M) that is manageable for moderate values of M. Furthermore, with a minor modification of the algorithm, we are able to find all solutions of the LOP with the complexity O(M 2^M). Numerical examples are shown on synthetic and real-world data, and the algorithms are publicly available.
LGAug 2, 2024
FIVB ranking: Misstep in the right directionSalma Tenni, Daniel Gomes de Pinho Zanco, Leszek Szczecinski
This work presents and evaluates the ranking algorithm that has been used by Federation Internationale de Volleyball (FIVB) since 2020. The prominent feature of the FIVB ranking is the use of the probabilistic model, which explicitly calculates the probabilities of the future matches results using the estimated teams' strengths. Such explicit modeling is new in the context of official sport rankings, especially for multi-level outcomes, and we study the optimality of its parameters using both analytical and numerical methods. We conclude that from the modeling perspective, the current thresholds fit well the data but adding the home-field advantage (HFA) would be beneficial. Regarding the algorithm itself, we explain the rationale behind the approximations currently used and show a simple method to find new parameters (numerical score) which improve the performance. We also show that the weighting of the match results is counterproductive.
LGDec 16, 2025
Low-rank MMSE filters, Kronecker-product representation, and regularization: a new perspectiveDaniel Gomes de Pinho Zanco, Leszek Szczecinski, Jacob Benesty et al.
In this work, we propose a method to efficiently find the regularization parameter for low-rank MMSE filters based on a Kronecker-product representation. We show that the regularization parameter is surprisingly linked to the problem of rank selection and, thus, properly choosing it, is crucial for low-rank settings. The proposed method is validated through simulations, showing significant gains over commonly used methods.
25.4MEApr 4
New insights into Elo algorithm for practitioners and statisticiansLeszek Szczecinski
This work reconciles two perspectives on the Elo ranking that coexist in the literature: the practitioner's view as a heuristic feedback rule, and the statistician's view as online maximum likelihood estimation via stochastic gradient ascent. Both perspectives coincide exactly in the binary case (iff the expected score is the logistic function). However, estimation noise forces a principled decoupling between the model used for ranking and the model used for prediction: the effective scale and home-field advantage parameter must be adjusted to account for the noise. We provide both closed-form corrections and a data-driven identification procedure. For multilevel outcomes, an exact relationship exists when outcome scores are uniformly spaced, but approximations are preferred in general: they account for estimation noise and better fit the data. The decoupled approach substantially outperforms the conventional one that reuses the ranking model for prediction, and serves as a diagnostic of convergence status. Applied to six years of FIFA men's ranking, we find that the ranking had not converged for the vast majority of national teams. The paper is written in a semi-tutorial style accessible to practitioners, with all key results accompanied by closed-form expressions and numerical examples.
ITDec 11, 2023
Automatic Regularization for Linear MMSE FiltersDaniel Gomes de Pinho Zanco, Leszek Szczecinski, Jacob Benesty
In this work, we consider the problem of regularization in the design of minimum mean square error (MMSE) linear filters. Using the relationship with statistical machine learning methods, using a Bayesian approach, the regularization parameter is found from the observed signals in a simple and automatic manner. The proposed approach is illustrated in system identification and beamforming examples, where the automatic regularization is shown to yield near-optimal results.
IRDec 20, 2021
FIFA ranking: Evaluation and path forwardLeszek Szczecinski, Iris-Ioana Roatis
In this work we study the ranking algorithm used by Fédération Internationale de Football Association (FIFA); we analyze the parameters it currently uses, show the formal probabilistic model from which it can be derived, and optimize the latter. In particular, analyzing the games since the introduction of the algorithm in 2018, we conclude that the game's "importance" (as defined by FIFA) used in the algorithm is counterproductive from the point of view of the predictive capability of the algorithm. We also postulate the algorithm to be rooted in the formal modelling principle, where the Davidson model proposed in 1970 seems to be an excellent candidate, preserving the form of the algorithm currently used. The results indicate that the predictive capability of the algorithm is notably improved by using the home-field advantage and the explicit model for the draws in the game. Moderate, but notable improvement may be attained by introducing the weighting of the results with the goal differential, which although not rooted in a formal modelling principle, is compatible with the current algorithm and can be tuned to the characteristics of the football competition.
MLApr 28, 2021
Simplified Kalman filter for online rating: one-fits-all approachLeszek Szczecinski, Raphaëlle Tihon
In this work, we deal with the problem of rating in sports, where the skills of the players/teams are inferred from the observed outcomes of the games. Our focus is on the online rating algorithms which estimate the skills after each new game by exploiting the probabilistic models of the relationship between the skills and the game outcome. We propose a Bayesian approach which may be seen as an approximate Kalman filter and which is generic in the sense that it can be used with any skills-outcome model and can be applied in the individual -- as well as in the group-sports. We show how the well-know algorithms (such as the Elo, the Glicko, and the TrueSkill algorithms) may be seen as instances of the one-fits-all approach we propose. In order to clarify the conditions under which the gains of the Bayesian approach over the simpler solutions can actually materialize, we critically compare the known and the new algorithms by means of numerical examples using the synthetic as well as the empirical data.
MEOct 20, 2020
G-Elo: Generalization of the Elo algorithm by modelling the discretized margin of victoryLeszek Szczecinski
In this work we develop a new algorithm for rating of teams (or players) in one-on-one games by exploiting the observed difference of the game-points (such as goals), also known as a margin of victory (MOV). Our objective is to obtain the Elo-style algorithm whose operation is simple to implement and to understand intuitively. This is done in three steps: first, we define the probabilistic model between the teams' skills and the discretized MOV variable: this generalizes the model underpinning the Elo algorithm, where the MOV variable is discretized into three categories (win/loss/draw). Second, with the formal probabilistic model at hand, the optimization required by the maximum likelihood rule is implemented via stochastic gradient; this yields simple on-line equations for the rating updates which are identical in their general form to those characteristic of the Elo algorithm: the main difference lies in the way the scores and the expected scores are defined. Third, we propose a simple method to estimate the coefficients of the model, and thus define the operation of the algorithm; it is done in a closed form using the historical data so the algorithm is tailored to the sport of interest and the coefficients defining its operation are determined in entirely transparent manner. The alternative, optimization-based strategy to find the coefficients is also presented. We show numerical examples based on the results of the association football of the English Premier League and the American football of the National Football League.
CVDec 6, 2019
Bilinear Models for Machine LearningTayssir Doghri, Leszek Szczecinski, Jacob Benesty et al.
In this work we define and analyze the bilinear models which replace the conventional linear operation used in many building blocks of machine learning (ML). The main idea is to devise the ML algorithms which are adapted to the objects they treat. In the case of monochromatic images, we show that the bilinear operation exploits better the structure of the image than the conventional linear operation which ignores the spatial relationship between the pixels. This translates into significantly smaller number of parameters required to yield the same performance. We show numerical examples of classification in the MNIST data set.