LG, Jun 2
How Many Trees in a Random Forest? A Revisited Approach with Plateau Search and Optuna Integration
Vadim Porvatov, Andrey Dukhovny, Andrey Lange
Hyperparameter optimization (HPO) for Random Forest faces a specific difficulty in tuning the number of trees: the predictive score typically improves monotonically with ensemble size, so standard methods such as Tree-structured Parzen Estimator (TPE) and Hyperband require a predefined search range and often drive the estimate toward its right boundary. Early-stopping strategies avoid fixing such a range, but can be sensitive to score noise and prone to premature stopping. To address this, we propose an integrated triplet-based plateau-search algorithm that removes the number of trees from the direct TPE search space and still exploits information accumulated across HPO trials. The method adaptively tracks a near-minimal sufficient ensemble size by monitoring relative changes in the out-of-bag (OOB) score across a triplet of forest sizes and shifting this triplet accordingly. This yields an automated and user-interpretable procedure based on a tolerance parameter. We also provide a theoretical analysis: we relate the proposed relative OOB-score criterion to the gap between the current and limiting scores, and derive an asymptotic variance estimate for the corresponding OOB-based absolute relative difference. Experiments show that the selected number of trees can differ substantially from the common heuristic: for most classical benchmark datasets it is smaller, whereas for some high-dimensional bioinformatics datasets, such as Arcene and Dorothea, it is larger. The source code and reproducible experiments are available at https://github.com/lange-am/rf_plateau_hpo.
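The plateau idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it only shifts a triplet of forest sizes to the right until the relative score change across the triplet falls below a tolerance, and it omits the Optuna/TPE integration and the reuse of information across trials. The names `plateau_search`, `score_fn`, `start`, `factor`, `tol`, and `max_trees` are hypothetical, and `score_fn` stands in for whatever returns the OOB score of a forest with `n` trees.

```python
from typing import Callable, Dict, Tuple


def relative_change(prev: float, curr: float) -> float:
    """Absolute relative difference between two scores."""
    return abs(curr - prev) / max(abs(curr), 1e-12)


def plateau_search(
    score_fn: Callable[[int], float],  # hypothetical: n_trees -> OOB score
    start: int = 16,
    factor: int = 2,
    tol: float = 0.01,
    max_trees: int = 4096,
) -> Tuple[int, bool]:
    """Shift a geometric triplet (n, n*factor, n*factor**2) to the right
    until both consecutive relative changes are within `tol`.

    Returns (selected_n, reached_plateau). If the size cap is hit first,
    the largest evaluated size is returned with reached_plateau=False.
    """
    cache: Dict[int, float] = {}

    def score(n: int) -> float:
        # Each forest size is scored at most once.
        if n not in cache:
            cache[n] = score_fn(n)
        return cache[n]

    n = start
    while True:
        triplet = (n, n * factor, n * factor * factor)
        s1, s2, s3 = (score(m) for m in triplet)
        flat = (
            relative_change(s1, s2) <= tol
            and relative_change(s2, s3) <= tol
        )
        if flat:
            # The smallest size of a flat triplet is taken as sufficient.
            return triplet[0], True
        if triplet[2] * factor > max_trees:
            # The next shift would exceed the cap: stop without a plateau.
            return triplet[2], False
        n *= factor


# Synthetic saturating curve standing in for a real OOB score.
selected, reached = plateau_search(lambda n: 0.9 * (1.0 - 1.0 / n))
print(selected, reached)
```

In practice, `score_fn` could fit a scikit-learn `RandomForestClassifier(n_estimators=n, oob_score=True)` and return its `oob_score_` attribute. The tolerance then plays the interpretable role described in the abstract: a smaller `tol` demands a flatter score curve before the search stops.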
HC, Aug 18, 2019
Sensors and Game Synchronization for Data Analysis in eSports
Anton Stepanov, Andrey Lange, Nikita Khromov et al.
The eSports industry has progressed greatly over the last decade in terms of audience, fundraising, broadcasting, networking, and hardware. Since the number and quality of professional teams have grown as well, there is a reasonable need to improve the skills and training process of professional eSports athletes. In this work, we demonstrate a system that collects heterogeneous data (physiological, environmental, video, telemetry) and guarantees synchronization with 10 ms accuracy. In particular, we demonstrate how to synchronize various sensors and ensure post-synchronization of the logged video (a so-called demo file) with the sensor data. Our experimental results on the CS:GO game discipline show up to 3 ms accuracy of the time synchronization of the gaming computer.
HC, Aug 18, 2019
Towards Understanding of eSports Athletes' Potentialities: The Sensing System for Data Collection and Analysis
Alexander Korotin, Nikita Khromov, Anton Stepanov et al.
eSports is a developing multidisciplinary research area. At present, there is a lack of relevant data collected from real eSports athletes and a lack of platforms that could be used for data collection and further analysis. In this paper, we present a sensing system that enables data collection from professional athletes. We also report on a case study of collecting and analyzing gaze data from the Monolith professional eSports team, which specializes in the Counter-Strike: Global Offensive (CS:GO) discipline. We perform a comparative study assessing the gaze of amateur players and professional athletes. The results of our work are vital for ensuring eSports data collection and subsequent analysis for scouting or assessing eSports players and athletes.
HC, Dec 7, 2018
Esports Athletes and Players: a Comparative Study
Nikita Khromov, Alexander Korotin, Andrey Lange et al.
We present a comparative study of the performance of players and professional players (athletes) in the Counter-Strike: Global Offensive (CS:GO) discipline. Our study is based on ubiquitous sensing, which helps identify the biometric features that contribute significantly to the classification of particular player skills. The research provides a better understanding of why athletes demonstrate superior performance compared to other players.