Crystal Cai

CV
h-index12
3papers
14citations
Novelty52%
AI Score26

3 Papers

CVMar 28, 2023
Colo-SCRL: Self-Supervised Contrastive Representation Learning for Colonoscopic Video Retrieval

Qingzhong Chen, Shilun Cai, Crystal Cai et al.

Colonoscopic video retrieval, which is a critical part of polyp treatment, has great clinical significance for the prevention and treatment of colorectal cancer. However, retrieval models trained on action recognition datasets usually produce unsatisfactory retrieval results on colonoscopic datasets due to the large domain gap between them. To seek a solution to this problem, we construct a large-scale colonoscopic dataset named Colo-Pair for medical practice. Based on this dataset, a simple yet effective training method called Colo-SCRL is proposed for more robust representation learning. It aims to refine general knowledge from colonoscopies through masked autoencoder-based reconstruction and momentum contrast to improve retrieval performance. To the best of our knowledge, this is the first attempt to employ the contrastive learning paradigm for medical video retrieval. Empirical results show that our method significantly outperforms current state-of-the-art methods in the colonoscopic video retrieval task.

CVJul 29, 2024
Towards Open-Set Myoelectric Gesture Recognition via Dual-Perspective Inconsistency Learning

Chen Liu, Can Han, Chengfeng Zhou et al.

Gesture recognition based on surface electromyography (sEMG) has achieved significant progress in human-machine interaction (HMI), especially in prosthetic control and movement rehabilitation. However, accurately recognizing predefined gestures within a closed set is still inadequate in practice; a robust open-set system needs to effectively reject unknown gestures while correctly classifying known ones, which is rarely explored in the field of myoelectric gesture recognition. To handle this challenge, we first report a significant distinction in prediction inconsistency discovered for unknown classes, which arises from different perspectives and can substantially enhance open-set recognition performance. Based on this insight, we propose a novel dual-perspective inconsistency learning approach, PredIN, to explicitly magnify the prediction inconsistency by enhancing the inconsistency of class feature distribution within different perspectives. Specifically, PredIN maximizes the class feature distribution inconsistency among the dual perspectives to enhance their differences. Meanwhile, it optimizes inter-class separability within an individual perspective to maintain individual performance. Comprehensive experiments on various benchmark datasets demonstrate that the PredIN outperforms state-of-the-art methods by a clear margin. Our proposed method simultaneously achieves accurate closed-set classification for predefined gestures and effective rejection for unknown gestures, exhibiting its efficacy and superiority in open-set gesture recognition based on sEMG.

CVDec 5, 2023
Towards Open-set Gesture Recognition via Feature Activation Enhancement and Orthogonal Prototype Learning

Chen Liu, Can Han, Chengfeng Zhou et al.

Gesture recognition is a foundational task in human-machine interaction (HMI). While there has been significant progress in gesture recognition based on surface electromyography (sEMG), accurate recognition of predefined gestures only within a closed set is still inadequate in practice. It is essential to effectively discern and reject unknown gestures of disinterest in a robust system. Numerous methods based on prototype learning (PL) have been proposed to tackle this open set recognition (OSR) problem. However, they do not fully explore the inherent distinctions between known and unknown classes. In this paper, we propose a more effective PL method leveraging two novel and inherent distinctions, feature activation level and projection inconsistency. Specifically, the Feature Activation Enhancement Mechanism (FAEM) widens the gap in feature activation values between known and unknown classes. Furthermore, we introduce Orthogonal Prototype Learning (OPL) to construct multiple perspectives. OPL acts to project a sample from orthogonal directions to maximize the distinction between its two projections, where unknown samples will be projected near the clusters of different known classes while known samples still maintain intra-class similarity. Our proposed method simultaneously achieves accurate closed-set classification for predefined gestures and effective rejection for unknown gestures. Extensive experiments demonstrate its efficacy and superiority in open-set gesture recognition based on sEMG.