CL, Oct 21, 2019

Trouble with the Curve: Predicting Future MLB Players Using Scouting Reports

arXiv:1910.12622v1 (has code)
Originality: Synthesis-oriented
AI Analysis

This work addresses player evaluation for baseball teams and analysts, but it is incremental as it applies existing methods to a new dataset.

The authors predict whether minor league baseball players will reach the MLB from their scouting reports. They build a dataset of nearly 10,000 reports and apply deep neural networks to it, but no concrete prediction results or numbers are provided.

In baseball, a scouting report profiles a player's characteristics and traits, usually intended for use in player valuation. This work presents a first-of-its-kind dataset of almost 10,000 scouting reports for minor league, international, and draft prospects. Compiled from articles posted to MLB.com and Fangraphs.com, each report consists of a written description of the player, numerical grades for several skills, and unique IDs to reference their profiles on popular resources like MLB.com, FanGraphs, and Baseball-Reference. With this dataset, we employ several deep neural networks to predict if minor league players will make the MLB given their scouting report. We open-source this data to share with the community, and present a web application demonstrating language variations in the reports of successful and unsuccessful prospects.
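To make the report structure described above concrete, here is a minimal sketch of how one record (written description, numerical skill grades, cross-site IDs) might be represented and turned into a feature vector. Everything named here is an assumption for illustration: the `ScoutingReport` fields, the `featurize` and `predict_proba` helpers, the skill names, and the 20-80 grading range are hypothetical and not taken from the released dataset or code. The scorer is a plain logistic function standing in for the paper's deep neural networks, not a reproduction of them.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ScoutingReport:
    """One report record. Field names are hypothetical, not the dataset's schema."""
    text: str                                  # written description of the player
    grades: Dict[str, float]                   # numerical skill grades, e.g. {"power": 60}
    ids: Dict[str, str] = field(default_factory=dict)  # site name -> player ID
    made_mlb: Optional[bool] = None            # label; None when the outcome is unknown


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def featurize(report: ScoutingReport, vocab: List[str], skills: List[str]) -> List[float]:
    """Concatenate bag-of-words counts with rescaled skill grades."""
    counts = Counter(tokenize(report.text))
    bag_of_words = [float(counts[word]) for word in vocab]
    # Assumption: grades lie on a 20-80 scale; map them to [0, 1].
    # A missing grade falls back to the midpoint (50).
    scaled = [(report.grades.get(skill, 50.0) - 20.0) / 60.0 for skill in skills]
    return bag_of_words + scaled


def predict_proba(weights: List[float], bias: float, features: List[float]) -> float:
    """Logistic score: a simple stand-in for a neural classifier's output."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))
```

A report would be featurized with a fixed vocabulary and skill list, for example `featurize(report, vocab=["plus", "arm"], skills=["power", "hit"])`, and the resulting vector passed to whatever classifier is being trained. Keeping text and grades in one vector is only one possible design; a neural model could equally encode the text separately and join the grades later.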

Code Implementations: 1 repo
