Simulating Tracking Data to Advance Sports Analytics Research
This work addresses the limited access to tracking data that researchers in sports analytics face, particularly for continuous sports. The contribution is incremental in that it simulates tracking data rather than collecting it from real matches.
The paper tackles the scarcity of high-resolution tracking data in continuous invasion sports like soccer by presenting a method to collect and utilize simulated tracking data from the Google Research Football environment, storing it in a representative schema and extracting features and events to support model development.
Advanced analytics have transformed how sports teams operate, particularly in episodic sports like baseball. Their impact on continuous invasion sports, such as soccer and ice hockey, has been limited due to increased game complexity and restricted access to high-resolution game tracking data. In this demo, we present a method to collect and utilize simulated soccer tracking data from the Google Research Football environment to support the development of models designed for continuous tracking data. The data is stored in a schema that is representative of real tracking data and we provide processes that extract high-level features and events. We include examples of established tracking data models to showcase the efficacy of the simulated data. We address the scarcity of publicly available tracking data, providing support for research at the intersection of artificial intelligence and sports analytics.
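As a rough illustration of the pipeline described above (store frames in a tracking-style schema, then extract events), the following is a minimal sketch and not the authors' implementation. It assumes each frame is a dictionary shaped like a Google Research Football raw observation, with the keys `ball`, `left_team`, `right_team`, and `ball_owned_team` (-1 when no team owns the ball, 0 for the left team, 1 for the right team). The `TrackingRow` schema, the `frame_to_rows` and `possession_changes` helpers, and the synthetic frames are hypothetical names and data introduced only for this example.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class TrackingRow:
    """One entity position in one frame (hypothetical long-format schema)."""
    frame: int
    team: str    # "left", "right", or "ball"
    player: int  # index within the team; -1 for the ball
    x: float
    y: float


def frame_to_rows(frame_idx: int, obs: Dict[str, Any]) -> List[TrackingRow]:
    """Flatten one observation into one row per tracked entity."""
    rows = [TrackingRow(frame_idx, "ball", -1,
                        float(obs["ball"][0]), float(obs["ball"][1]))]
    for team in ("left", "right"):
        for i, (x, y) in enumerate(obs[f"{team}_team"]):
            rows.append(TrackingRow(frame_idx, team, i, float(x), float(y)))
    return rows


def possession_changes(frames: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Emit an event whenever the owning team differs from the last known owner.

    Frames where no team owns the ball (-1) are skipped, so a pass in flight
    does not count as a change by itself.
    """
    events = []
    last_owner = -1
    for idx, obs in enumerate(frames):
        owner = obs["ball_owned_team"]
        if owner == -1:
            continue
        if last_owner != -1 and owner != last_owner:
            events.append({"frame": idx, "event": "possession_change",
                           "from_team": last_owner, "to_team": owner})
        last_owner = owner
    return events


# Synthetic frames (made up for illustration): one player per side.
frames = [
    {"ball": [0.00, 0.0, 0.0], "left_team": [[-0.1, 0.0]],
     "right_team": [[0.1, 0.0]], "ball_owned_team": 0},
    {"ball": [0.05, 0.0, 0.0], "left_team": [[-0.1, 0.0]],
     "right_team": [[0.1, 0.0]], "ball_owned_team": -1},
    {"ball": [0.10, 0.0, 0.0], "left_team": [[-0.1, 0.0]],
     "right_team": [[0.1, 0.0]], "ball_owned_team": 1},
]

rows = [row for i, obs in enumerate(frames) for row in frame_to_rows(i, obs)]
events = possession_changes(frames)
```

In practice the frames would come from stepping the simulator rather than from hand-written dictionaries, and the long-format rows could be written to a database or a columnar file so that models built for real tracking data can read them with little change.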