PATS: Proficiency-Aware Temporal Sampling for Multi-View Sports Skill Assessment
This work addresses the challenge of maintaining temporal continuity in video-based skill assessment for sports and other activities, offering an incremental improvement over existing sampling methods.
The paper tackles automated sports skill assessment by introducing PATS, a proficiency-aware temporal sampling strategy that preserves complete fundamental movements within video segments. It improves on state-of-the-art accuracy by up to +3.05% on the EgoExo4D benchmark, with larger gains in individual domains such as bouldering (+26.22%).
Automated sports skill assessment requires capturing fundamental movement patterns that distinguish expert from novice performance, yet current video sampling methods disrupt the temporal continuity essential for proficiency evaluation. To this end, we introduce Proficiency-Aware Temporal Sampling (PATS), a novel sampling strategy that preserves complete fundamental movements within continuous temporal segments for multi-view skill assessment. PATS adaptively segments videos to ensure each analyzed portion contains the full execution of critical performance components, repeating this process across multiple segments to maximize information coverage while maintaining temporal coherence. Evaluated on the EgoExo4D benchmark with SkillFormer, PATS surpasses state-of-the-art accuracy across all viewing configurations (+0.65% to +3.05%) and delivers substantial gains in challenging domains (+26.22% bouldering, +2.39% music, +1.13% basketball). Systematic analysis reveals that PATS successfully adapts to diverse activity characteristics, from high-frequency sampling for dynamic sports to fine-grained segmentation for sequential skills, demonstrating its effectiveness as an adaptive approach to temporal sampling that advances automated skill assessment for real-world applications. Visit our project page at https://edowhite.github.io/PATS
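The general idea of sampling frames from several continuous windows, rather than scattering them across the whole video, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `sample_continuous_segments` and all of its parameters are hypothetical, and the window length and segment count are fixed inputs here, whereas the abstract describes PATS as choosing them adaptively per activity.

```python
from typing import List


def sample_continuous_segments(
    num_frames: int,
    fps: float,
    num_segments: int,
    segment_seconds: float,
    frames_per_segment: int,
) -> List[List[int]]:
    """Return frame indices for continuous windows spread evenly over a video.

    Hypothetical helper for illustration only; all parameters are assumptions.
    """
    if min(num_frames, num_segments, frames_per_segment) <= 0:
        raise ValueError("frame and segment counts must be positive")
    if fps <= 0 or segment_seconds <= 0:
        raise ValueError("fps and segment_seconds must be positive")

    # Window length in frames, clipped to the length of the video.
    window = min(num_frames, max(1, round(segment_seconds * fps)))

    # Spread window start positions evenly over [0, num_frames - window].
    last_start = num_frames - window
    if num_segments == 1:
        starts = [last_start // 2]
    else:
        starts = [round(i * last_start / (num_segments - 1))
                  for i in range(num_segments)]

    segments = []
    for start in starts:
        # Sample frames uniformly inside the window so that each segment
        # covers one uninterrupted stretch of the movement.
        if frames_per_segment == 1:
            offsets = [window // 2]
        else:
            step = (window - 1) / (frames_per_segment - 1)
            offsets = [round(j * step) for j in range(frames_per_segment)]
        segments.append([start + offset for offset in offsets])
    return segments


segments = sample_continuous_segments(
    num_frames=300, fps=30.0, num_segments=3,
    segment_seconds=2.0, frames_per_segment=4,
)
```

With these example values, each window spans 60 frames (2 seconds at 30 fps); the first window covers frames 0 to 59 and the last covers frames 240 to 299. If `frames_per_segment` exceeds the window length, some indices repeat, which a real pipeline would need to handle.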