Contrastive Learning for Sports Video: Unsupervised Player Classification
This work addresses real-time team classification in sports analytics, for applications such as player tracking. Its contribution is incremental, since it builds on existing contrastive learning techniques.
The paper tackles unsupervised classification of players by team in sports video without prior knowledge of jersey colors. It uses contrastive learning to train an embedding network that maximizes the distance between representations of players on different teams relative to players on the same team. On a new hockey dataset, it achieves 94% accuracy after unsupervised training on a single frame and 97% within 500 frames, outperforming prior unsupervised methods.
We address the problem of unsupervised classification of players in a team sport according to their team affiliation, when jersey colours and design are not known a priori. We adopt a contrastive learning approach in which an embedding network learns to maximize the distance between representations of players on different teams relative to players on the same team, in a purely unsupervised fashion, without any labelled data. We evaluate the approach using a new hockey dataset and find that it outperforms prior unsupervised approaches by a substantial margin, particularly for real-time application when only a small number of frames are available for unsupervised learning before team assignments must be made. Remarkably, we show that our contrastive method achieves 94% accuracy after unsupervised training on only a single frame, with accuracy rising to 97% within 500 frames (17 seconds of game time). We further demonstrate how accurate team classification allows accurate team-conditional heat maps of player positioning to be computed.
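The contrastive objective described above can be illustrated with a triplet-style hinge loss. This is only a minimal sketch under stated assumptions: the abstract does not specify the loss function, the margin, or how anchor, positive, and negative examples are chosen without labels. The name `triplet_loss` and the `margin` value are hypothetical, and the embeddings are assumed to come from some embedding network that is not shown.

```python
import numpy as np


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Mean hinge triplet loss over a batch of embeddings of shape (n, d).

    Assumption (not from the abstract): `positive` is meant to be an
    embedding that should lie close to `anchor`, and `negative` one that
    should lie far from it. How such triplets are sampled without labels
    is left unspecified here.
    """
    anchor = np.asarray(anchor, dtype=float)
    positive = np.asarray(positive, dtype=float)
    negative = np.asarray(negative, dtype=float)

    # Squared Euclidean distances between anchor and the other two samples.
    d_pos = np.sum((anchor - positive) ** 2, axis=1)
    d_neg = np.sum((anchor - negative) ** 2, axis=1)

    # The loss is zero once the negative is farther than the positive
    # by at least `margin`; otherwise it grows with the violation.
    return float(np.mean(np.maximum(0.0, d_pos - d_neg + margin)))


# Toy 2-D embeddings: the positive sits near the anchor, the negative far away.
a = [[0.0, 0.0]]
p = [[0.0, 0.1]]
n = [[1.0, 0.0]]
well_separated = triplet_loss(a, p, n)  # margin satisfied, so no penalty
swapped = triplet_loss(a, n, p)         # roles reversed, so a penalty is incurred
```

Minimizing a loss of this kind pushes the embedding network toward the behavior the abstract describes, where distances between players on different teams become large relative to distances between players on the same team.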