CoachMe: Decoding Sport Elements with a Reference-Based Coaching Instruction Generation Model
This work addresses the problem of providing informative coaching feedback for athletes in specific sports such as skating and boxing. The contribution is incremental: it builds on existing multimodal models and adds a novel reference-based approach.
The paper tackles the challenge of generating precise, sport-specific motion instructions by proposing CoachMe, a reference-based model that analyzes the differences between a learner's motion and a reference motion. On G-Eval, CoachMe outperforms GPT-4o by 31.6% on figure skating and by 58.3% on boxing.
Motion instruction is a crucial task that helps athletes refine their technique by analyzing movements and providing corrective guidance. Although recent advances in multimodal models have improved motion understanding, generating precise and sport-specific instruction remains challenging due to the highly domain-specific nature of sports and the need for informative guidance. We propose CoachMe, a reference-based model that analyzes the differences between a learner's motion and a reference under temporal and physical aspects. This approach enables both domain-knowledge learning and the acquisition of a coach-like thinking process that identifies movement errors effectively and provides feedback explaining how to improve. In this paper, we illustrate how CoachMe adapts well to specific sports such as skating and boxing by learning from general movements and then leveraging limited data. Experiments show that CoachMe provides high-quality instructions rather than directions that merely adopt a coach's tone while lacking critical information. CoachMe outperforms GPT-4o by 31.6% in G-Eval on figure skating and by 58.3% on boxing. Analysis further confirms that it elaborates on errors and their corresponding improvement methods in the generated instructions. CoachMe is available at https://motionxperts.github.io/
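To make the idea of comparing a learner's motion with a reference "under temporal and physical aspects" more concrete, here is a minimal sketch. It is not CoachMe's implementation, which this summary does not describe. It assumes motions are given as arrays of 3D joint positions with shape (frames, joints, 3), and it swaps in classic dynamic time warping (DTW) for the temporal part and a per-joint mean distance for the physical part. The function names `dtw_align` and `per_joint_deviation` are hypothetical.

```python
import numpy as np


def dtw_align(learner, reference):
    """Align two pose sequences of shape (frames, joints, 3) with classic DTW.

    Returns a list of (learner_frame, reference_frame) index pairs.
    Illustrative only; not the alignment used by CoachMe.
    """
    n, m = len(learner), len(reference)
    # Frame-to-frame cost: mean Euclidean distance over joints, shape (n, m).
    cost = np.linalg.norm(
        learner[:, None] - reference[None, :], axis=-1
    ).mean(axis=-1)

    # Accumulated cost table with an infinite border.
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]
            )

    # Backtrack from the last cell to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = int(np.argmin([acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    path.reverse()
    return path


def per_joint_deviation(learner, reference, path):
    """Mean distance of each joint from the reference over aligned frames."""
    learner_idx, reference_idx = zip(*path)
    diffs = np.linalg.norm(
        learner[list(learner_idx)] - reference[list(reference_idx)], axis=-1
    )  # shape (aligned_pairs, joints)
    return diffs.mean(axis=0)
```

In a pipeline of this kind, the warping path would capture timing differences (for example, a learner who performs the movement more slowly), and the per-joint deviations would point to which body parts differ most from the reference. How CoachMe turns such differences into written instructions is beyond this sketch.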