Cross-individual generalizability of machine learning models for ball speed prediction in baseball pitching
For sports scientists and practitioners, this work highlights a critical gap in cross-individual model performance. The findings are incremental, however, because they confirm known limitations of ML generalizability within one specific domain.
This study evaluated the cross-individual generalizability of ML models for predicting ball speed in baseball pitching. Predictive performance dropped from R² = 0.91 (within-individual) to R² = 0.38 (cross-individual), the model tended to overestimate Intermediate pitchers relative to Expert pitchers, and trunk and pivot leg features generalized best.
Although machine learning (ML)-based prediction of performance outcomes is an important topic in contemporary sports science, the cross-individual generalizability of ML models in sports contexts remains poorly understood. To address this issue, this study evaluated the cross-individual generalizability of ML models for predicting ball speed in baseball pitching. A dataset comprising 50 pitchers from various competitive levels was analyzed. Cross-individual generalizability was assessed using leave-one-subject-out cross-validation. Specifically, the effects of expertise level and of restrictions on spatiotemporal motion information were examined to identify factors influencing model generalizability. The results revealed that, under cross-individual evaluation, (1) predictive performance was markedly lower than under within-individual evaluation, with the R-squared value decreasing from 0.91 to 0.38; (2) the model tended to overestimate the performance of Intermediate pitchers relative to Expert pitchers, with a significant group difference in signed prediction error (p < .05); and (3) the trunk and pivot leg demonstrated relatively high generalization performance, with the pivot leg showing notable generalizability even during the weight-shift initiation phase (R-squared value > 0.25). These findings underscore the importance of cross-individual evaluation for the practical applicability of ML in sports settings and contribute to a deeper understanding of the biomechanical factors underlying the target movement.
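
To make the evaluation scheme concrete, the following is a minimal sketch of leave-one-subject-out cross-validation, not the study's actual pipeline. Everything in it is an assumption made for illustration: the synthetic data (6 hypothetical pitchers, one motion feature, a per-pitcher offset), the single-feature least-squares model, the pooling of held-out predictions into one R-squared value, and the sign convention for the error (predicted minus observed, so a positive value means overestimation).

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x with a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def loso_folds(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) per subject."""
    for subject in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != subject]
        test = [i for i, s in enumerate(subject_ids) if s == subject]
        yield subject, train, test


def loso_evaluate(xs, ys, subject_ids):
    """Fit on all other subjects, predict the held-out one, pool the results."""
    y_true, y_pred, signed_error = [], [], {}
    for subject, train, test in loso_folds(subject_ids):
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        preds = [a + b * xs[i] for i in test]
        truth = [ys[i] for i in test]
        y_true.extend(truth)
        y_pred.extend(preds)
        # Assumed convention: predicted minus observed, averaged per subject.
        signed_error[subject] = sum(p - t for p, t in zip(preds, truth)) / len(test)
    return r_squared(y_true, y_pred), signed_error


# Synthetic data (hypothetical): each pitcher has an individual speed offset
# that the model cannot learn from the other pitchers.
rng = random.Random(0)
xs, ys, subject_ids = [], [], []
for pitcher in range(6):
    offset = rng.gauss(0.0, 3.0)
    for _ in range(10):
        x = rng.uniform(0.0, 1.0)
        xs.append(x)
        ys.append(120.0 + 20.0 * x + offset + rng.gauss(0.0, 1.0))
        subject_ids.append(pitcher)

r2, signed_error = loso_evaluate(xs, ys, subject_ids)
print(f"pooled cross-individual R^2: {r2:.2f}")
for subject, err in signed_error.items():
    print(f"pitcher {subject}: mean signed error {err:+.2f}")
```

Because no pitch from the held-out pitcher is ever seen during fitting, any individual-specific offset shows up as signed error on that pitcher. Averaging these per-pitcher errors within an expertise group is one way a group-level overestimation could be examined.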