HRTR: A Single-stage Transformer for Fine-grained Sub-second Action Segmentation in Stroke Rehabilitation
This addresses the need for precise movement tracking in stroke rehabilitation to monitor patient progress, representing a domain-specific advancement.
The paper tackles the problem of fine-grained, sub-second action segmentation in stroke rehabilitation by proposing HRTR, a single-stage transformer that eliminates multi-stage methods and post-processing, achieving Edit Scores of 70.1 on StrokeRehab Video, 69.4 on StrokeRehab IMU, and 88.4 on 50Salads.
Stroke rehabilitation often demands precise tracking of patient movements to monitor progress, with complexities of rehabilitation exercises presenting two critical challenges: fine-grained and sub-second (under one-second) action detection. In this work, we propose the High Resolution Temporal Transformer (HRTR), to time-localize and classify high-resolution (fine-grained), sub-second actions in a single-stage transformer, eliminating the need for multi-stage methods and post-processing. Without any refinements, HRTR outperforms state-of-the-art systems on both stroke related and general datasets, achieving Edit Score (ES) of 70.1 on StrokeRehab Video, 69.4 on StrokeRehab IMU, and 88.4 on 50Salads.