CVApr 22, 2024

CoFInAl: Enhancing Action Quality Assessment with Coarse-to-Fine Instruction Alignment

Kanglei Zhou, Junlin Li, Ruizhi Cai, Liyuan Wang, Xingxing Zhang, Xiaohui Liang

arXiv:2404.13999v127 citationsh-index: 8Has CodeIJCAI

Originality Incremental advance

AI Analysis

This addresses the challenge of quantifying actions in domains like sports and medical care, offering an incremental improvement over existing methods.

The paper tackled the problem of Action Quality Assessment (AQA) by proposing CoFInAl, a method that reformulates AQA as a coarse-to-fine classification task, achieving state-of-the-art performance with correlation gains of 5.49% and 3.55% on two datasets.

Action Quality Assessment (AQA) is pivotal for quantifying actions across domains like sports and medical care. Existing methods often rely on pre-trained backbones from large-scale action recognition datasets to boost performance on smaller AQA datasets. However, this common strategy yields suboptimal results due to the inherent struggle of these backbones to capture the subtle cues essential for AQA. Moreover, fine-tuning on smaller datasets risks overfitting. To address these issues, we propose Coarse-to-Fine Instruction Alignment (CoFInAl). Inspired by recent advances in large language model tuning, CoFInAl aligns AQA with broader pre-trained tasks by reformulating it as a coarse-to-fine classification task. Initially, it learns grade prototypes for coarse assessment and then utilizes fixed sub-grade prototypes for fine-grained assessment. This hierarchical approach mirrors the judging process, enhancing interpretability within the AQA framework. Experimental results on two long-term AQA datasets demonstrate CoFInAl achieves state-of-the-art performance with significant correlation gains of 5.49% and 3.55% on Rhythmic Gymnastics and Fis-V, respectively. Our code is available at https://github.com/ZhouKanglei/CoFInAl_AQA.

View on arXiv PDF Code

Similar