Nuclear penalized multinomial regression with an application to predicting at bat outcomes in baseball
This work targets prediction accuracy in sports analytics, specifically for baseball at-bat outcomes. Its contribution is incremental: it adapts an existing regularization method, the nuclear norm penalty, to the multinomial setting.
The authors tackle the problem of predicting at-bat outcomes in baseball by proposing nuclear penalized multinomial regression (NPMR) as an alternative to the ridge penalty. NPMR leverages structure among the response categories to improve predictions, and the fitted results align with subject-area expertise while offering new insight into what differentiates players.
We propose the nuclear norm penalty as an alternative to the ridge penalty for regularized multinomial regression. This convex relaxation of reduced-rank multinomial regression has the advantage of leveraging underlying structure among the response categories to make better predictions. We apply our method, nuclear penalized multinomial regression (NPMR), to Major League Baseball play-by-play data to predict outcome probabilities based on batter-pitcher matchups. The interpretation of the results meshes well with subject-area expertise and also suggests a novel understanding of what differentiates players.
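To make the idea concrete, the following is a minimal sketch of nuclear-norm-penalized multinomial regression, not the authors' implementation. It assumes the objective is the mean multinomial negative log-likelihood plus lam times the nuclear norm of the p-by-K coefficient matrix B, with an unpenalized intercept alpha, and it uses plain proximal gradient descent as the solver. The solver choice, the function names (fit_npmr, svd_soft_threshold, npmr_objective), the step-size rule, and the synthetic data are all assumptions made for illustration; the abstract does not specify any of them.

```python
import numpy as np

def softmax(Z):
    """Row-wise softmax, shifted for numerical stability."""
    Z = Z - Z.max(axis=1, keepdims=True)
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def svd_soft_threshold(B, tau):
    """Proximal operator of tau * nuclear norm: shrink each singular value by tau."""
    U, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def npmr_objective(X, Y, alpha, B, lam):
    """Mean multinomial negative log-likelihood plus lam * ||B||_* (assumed form)."""
    P = softmax(alpha + X @ B)
    nll = -np.sum(Y * np.log(P + 1e-12)) / X.shape[0]
    return nll + lam * np.linalg.svd(B, compute_uv=False).sum()

def fit_npmr(X, Y, lam, n_iter=500):
    """Proximal gradient descent; Y is an n x K one-hot response matrix."""
    n, p = X.shape
    K = Y.shape[1]
    alpha, B = np.zeros(K), np.zeros((p, K))
    # Conservative step size from the largest singular value of [1, X].
    X1 = np.hstack([np.ones((n, 1)), X])
    step = n / np.linalg.norm(X1, 2) ** 2
    for _ in range(n_iter):
        R = (softmax(alpha + X @ B) - Y) / n        # scaled residuals, n x K
        alpha = alpha - step * R.sum(axis=0)        # intercept: plain gradient step
        B = svd_soft_threshold(B - step * (X.T @ R), step * lam)
    return alpha, B

# Synthetic example (hypothetical data): rank-1 true coefficient matrix.
rng = np.random.default_rng(0)
n, p, K = 400, 6, 5
X = rng.normal(size=(n, p))
B_true = rng.normal(size=(p, 1)) @ rng.normal(size=(1, K))
P_true = softmax(X @ B_true)
labels = np.array([rng.choice(K, p=row) for row in P_true])
Y = np.eye(K)[labels]

alpha_hat, B_hat = fit_npmr(X, Y, lam=0.05)
print("singular values of fitted B:", np.linalg.svd(B_hat, compute_uv=False).round(3))
```

The soft-thresholding step is where the penalty differs from ridge: ridge shrinks every coefficient toward zero, whereas shrinking the singular values can set some of them exactly to zero. The result is a low-rank B, in which the response categories share a small number of latent directions.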