CV · Jul 8, 2025

CuriosAI Submission to the EgoExo4D Proficiency Estimation Challenge 2025

arXiv:2507.08022v1 · h-index: 2
AI Analysis

This work addresses proficiency estimation for video-based skill assessment. Its contribution is incremental: it applies existing models (Sapiens-2B and VideoMAE) to a specific challenge.

The paper tackles multi-view skill assessment in the EgoExo4D Proficiency Estimation Challenge with two methods. The two-stage pipeline reaches 47.8% accuracy, compared with 43.6% for the multi-task model, which the authors take as evidence that scenario-conditioned modeling is effective.

This report presents the CuriosAI team's submission to the EgoExo4D Proficiency Estimation Challenge at CVPR 2025. We propose two methods for multi-view skill assessment: (1) a multi-task learning framework using Sapiens-2B that jointly predicts proficiency and scenario labels (43.6% accuracy), and (2) a two-stage pipeline combining zero-shot scenario recognition with view-specific VideoMAE classifiers (47.8% accuracy). The superior performance of the two-stage approach demonstrates the effectiveness of scenario-conditioned modeling for proficiency estimation.
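
The control flow of the two-stage pipeline can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function `predict_proficiency`, the four-level label set, the `(scenario, view)` classifier registry, and the fusion of views by summing class probabilities are all assumptions made for the example. The scenario recognizer and the per-view classifiers are passed in as plain callables, standing in for the zero-shot recognizer and the VideoMAE models.

```python
from typing import Any, Callable, Dict, Sequence, Tuple

# Assumed label set for illustration; the report itself does not list the classes.
PROFICIENCY_LEVELS = ("novice", "early_expert", "intermediate_expert", "late_expert")

Clip = Any  # placeholder for whatever video representation the models consume
Classifier = Callable[[Clip], Sequence[float]]  # returns one probability per level


def predict_proficiency(
    clips_by_view: Dict[str, Clip],
    recognize_scenario: Callable[[Clip], str],
    classifiers: Dict[Tuple[str, str], Classifier],
    scenario_view: str = "ego",
) -> Tuple[str, str]:
    """Stage 1: pick a scenario. Stage 2: route each view to the
    classifier registered for that (scenario, view) pair and fuse."""
    # Stage 1: scenario recognition on a single reference view (assumption).
    scenario = recognize_scenario(clips_by_view[scenario_view])

    # Stage 2: accumulate class probabilities over all views that have
    # a classifier for this scenario (sum fusion is an assumption).
    totals = [0.0] * len(PROFICIENCY_LEVELS)
    used = 0
    for view, clip in clips_by_view.items():
        classifier = classifiers.get((scenario, view))
        if classifier is None:
            continue  # no model for this scenario/view combination
        for i, p in enumerate(classifier(clip)):
            totals[i] += p
        used += 1

    if used == 0:
        raise KeyError(f"no classifier registered for scenario {scenario!r}")

    best = max(range(len(totals)), key=totals.__getitem__)
    return scenario, PROFICIENCY_LEVELS[best]
```

The point of the sketch is the routing step: the stage-two model is chosen by the stage-one scenario label, so each classifier only has to separate proficiency levels within one scenario and one camera view.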
