From Detection to Anticipation: Online Understanding of Struggles across Various Tasks and Activities
This work addresses the need for real-time assistive systems that can detect and anticipate user struggles across various tasks, though its contribution is incremental because it adapts two off-the-shelf models as baselines.
The paper tackles real-time struggle recognition for intelligent assistive systems by reformulating struggle localization as an online detection and anticipation task. Detection achieves 70-80% per-frame mAP, anticipation up to 2 seconds ahead performs comparably with slight drops, and the feature-based models run at up to 143 FPS (around 20 FPS for the whole pipeline, including feature extraction).
Understanding human skill performance is essential for intelligent assistive systems, with struggle recognition offering a natural cue for identifying user difficulties. While prior work focuses on offline struggle classification and localization, real-time applications require models capable of detecting and anticipating struggle online. We reformulate struggle localization as an online detection task and further extend it to anticipation, predicting struggle moments before they occur. We adapt two off-the-shelf models as baselines for online struggle detection and anticipation. Online struggle detection achieves 70-80% per-frame mAP, while struggle anticipation up to 2 seconds ahead yields comparable performance with slight drops. We further examine generalization across tasks and activities and analyse the impact of skill evolution. Despite larger domain gaps in activity-level generalization, models still outperform random baselines by 4-20%. Our feature-based models run at up to 143 FPS, and the whole pipeline, including feature extraction, operates at around 20 FPS, sufficient for real-time assistive applications.
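To make the per-frame metric and the anticipation setup concrete, here is a minimal sketch and not the paper's evaluation code. It assumes a single binary struggle class, one confidence score per frame, and a hypothetical frame rate. The function names average_precision and shift_for_anticipation are illustrative. mAP would then be the mean of such per-class AP values.

```python
from typing import List, Sequence, Tuple


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Per-frame AP for one class: mean precision at the rank of each positive frame."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    total_pos = sum(labels)
    if total_pos == 0:
        return 0.0  # convention chosen for this sketch when no positive frame exists
    # Rank frames by descending score; ties keep their input order (stable sort).
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / total_pos


def shift_for_anticipation(
    scores: Sequence[float], labels: Sequence[int], horizon_frames: int
) -> Tuple[List[float], List[int]]:
    """Pair the score emitted at frame t with the label at frame t + horizon_frames."""
    if horizon_frames < 0:
        raise ValueError("horizon_frames must be non-negative")
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    usable = len(labels) - horizon_frames
    if usable <= 0:
        return [], []
    # Frames near the end have no future label and are dropped.
    return list(scores[:usable]), list(labels[horizon_frames:])


# Toy data (invented for illustration, not taken from the paper).
fps = 2                      # hypothetical frame rate
horizon_frames = 2 * fps     # anticipate 2 seconds ahead
scores = [0.1, 0.2, 0.8, 0.7, 0.3, 0.2, 0.9, 0.6]
labels = [0, 0, 1, 1, 0, 0, 1, 1]

detection_ap = average_precision(scores, labels)
future_scores, future_labels = shift_for_anticipation(scores, labels, horizon_frames)
anticipation_ap = average_precision(future_scores, future_labels)
print(detection_ap, anticipation_ap)
```

In this sketch anticipation differs from detection only in which ground-truth frame each score is compared against, so the same metric serves both settings.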