Generalizing Sports Feedback Generation by Watching Competitions and Reading Books: A Rock Climbing Case Study
This work addresses the problem of generating practical sports feedback from AI systems under limited annotations. The contribution is incremental, since it builds on existing video-LLM methods with domain-specific adaptations.
The paper tackles sports feedback generation with video-LLMs, which generalize poorly to unseen sports and rely on expensive finetuning data. Using rock climbing as a case study, it proposes adding auxiliary web data from the target sport, such as competition videos and coaching manuals, to improve performance, and it introduces specificity and actionability metrics for better evaluation.
While there is rapid progress in video-LLMs with advanced reasoning capabilities, prior work shows that these models struggle on the challenging task of sports feedback generation and require expensive and difficult-to-collect finetuning feedback data for each sport. This limitation is evident from their poor generalization to sports unseen during finetuning. Furthermore, traditional text generation evaluation metrics (e.g., BLEU-4, METEOR, ROUGE-L, BERTScore), originally developed for machine translation and summarization, fail to capture the unique aspects of sports feedback quality. To address the first problem, using rock climbing as our case study, we propose using freely available auxiliary web data from the target domain, such as competition videos and coaching manuals, in addition to existing sports feedback from a disjoint source domain, to improve sports feedback generation performance on the target domain. To improve evaluation, we propose two evaluation metrics: (1) specificity and (2) actionability. Together, our approach enables more meaningful and practical generation of sports feedback under limited annotations.
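The point about overlap-based metrics can be made concrete with a toy sketch. The snippet below computes clipped unigram precision, which is only one ingredient of BLEU and not the full BLEU-4 score, for two candidate feedback sentences against a reference. All three sentences are invented for illustration and do not come from the paper's data, and the code does not implement the proposed specificity or actionability metrics.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(candidate, reference, n=1):
    """Clipped n-gram precision of a candidate against one reference.

    Each candidate n-gram is credited at most as many times as it
    appears in the reference. Tokenization is naive whitespace splitting.
    """
    cand_counts = Counter(ngrams(candidate.lower().split(), n))
    ref_counts = Counter(ngrams(reference.lower().split(), n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, ref_counts[gram])
                  for gram, count in cand_counts.items())
    return overlap / total


# Hypothetical sentences, made up for this illustration.
reference = "keep your hips close to the wall and push through your left foot"
vague = "try to keep close to the wall and use your foot"
specific = "turn the right hip into the rock then stand up on the left toe"

vague_score = clipped_precision(vague, reference)
specific_score = clipped_precision(specific, reference)

# The vague sentence reuses the reference wording and scores higher,
# although the other sentence names a concrete body part and action.
print(f"vague:    {vague_score:.3f}")
print(f"specific: {specific_score:.3f}")
```

In this toy case the score tracks shared wording rather than how concrete or usable the advice is, which is the kind of gap the abstract attributes to metrics designed for translation and summarization.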