Visualization of Unstructured Sports Data -- An Example of Cricket Short Text Commentary
This work addresses the underutilization of unstructured data in sports visualization for cricket analysts and fans, though it is incremental as it applies existing visualization concepts to a new data source.
The authors tackled the problem of visualizing unstructured sports data by using cricket short text commentary to construct and visualize individual players' strength and weakness rules, analyzing over one million text commentaries and validating the rules through two methods.
Sports visualization focuses on the use of structured data, such as box-score data and tracking data. Unstructured data sources pertaining to sports are available in various places, such as blogs, social media posts, and online news articles. Existing sports visualization methods have either not fully exploited the information present in these sources, or the visualizations built from these sources have not added to the body of sports visualization methods. We propose the use of unstructured data, namely cricket short text commentary, for visualization. The short text commentary data is used to construct each individual player's strength rules and weakness rules. A computationally feasible definition of a player's strength rule and weakness rule is proposed, and a visualization method for the constructed rules is presented. In addition, players having similar strength rules or weakness rules are identified and visualized. We demonstrate the usefulness of short text commentary in visualization by analyzing the strengths and weaknesses of cricket players using more than one million text commentaries. We validate the constructed rules through two validation methods. The collected data, source code, and obtained results on more than 500 players are made publicly available.
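The abstract does not spell out the rule definition, so the following is only a toy sketch of the general idea and not the authors' method. It assumes that each commentary line has already been reduced to a hypothetical (bowling feature, batting outcome) pair, labels a feature as a strength or weakness when one outcome dominates (the threshold of 0.6 is an arbitrary assumption), and compares players by the Jaccard similarity of their rule sets, which is one possible similarity measure among many.

```python
from collections import Counter

# Hypothetical toy data: (bowling feature, batting outcome) pairs that one might
# extract from a single player's commentary lines. Not taken from the paper's data.
deliveries = [
    ("short", "attacked"), ("short", "attacked"), ("short", "beaten"),
    ("full", "beaten"), ("full", "beaten"), ("full", "attacked"),
    ("good length", "defended"),
]

def derive_rules(pairs, threshold=0.6):
    """Label a bowling feature a strength or weakness when one outcome dominates.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    totals = Counter(feature for feature, _ in pairs)
    by_outcome = Counter(pairs)
    strengths, weaknesses = set(), set()
    for feature, total in totals.items():
        if by_outcome[(feature, "attacked")] / total >= threshold:
            strengths.add(feature)
        elif by_outcome[(feature, "beaten")] / total >= threshold:
            weaknesses.add(feature)
    return strengths, weaknesses

def jaccard(a, b):
    """Jaccard similarity between two rule sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

strengths, weaknesses = derive_rules(deliveries)
print("strengths:", strengths)
print("weaknesses:", weaknesses)
```

With this toy input, "short" is attacked on two of three deliveries and "full" beats the batter on two of three, so the sketch reports the former as a strength and the latter as a weakness. Calling `jaccard` on two players' strength sets then gives a simple score for how similar their strengths are.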