Can Han

CV
h-index18
8papers
9citations
Novelty44%
AI Score54

8 Papers

CVDec 2, 2025Code
MICCAI STSR 2025 Challenge: Semi-Supervised Teeth and Pulp Segmentation and CBCT-IOS Registration

Yaqi Wang, Zhi Li, Chengyu Wu et al.

Cone-Beam Computed Tomography (CBCT) and Intraoral Scanning (IOS) are essential for digital dentistry, but annotated data scarcity limits automated solutions for pulp canal segmentation and cross-modal registration. To benchmark semi-supervised learning (SSL) in this domain, we organized the STSR 2025 Challenge at MICCAI 2025, featuring two tasks: (1) semi-supervised segmentation of teeth and pulp canals in CBCT, and (2) semi-supervised rigid registration of CBCT and IOS. We provided 60 labeled and 640 unlabeled IOS samples, plus 30 labeled and 250 unlabeled CBCT scans with varying resolutions and fields of view. The challenge attracted strong community participation, with top teams submitting open-source deep learning-based SSL solutions. For segmentation, leading methods used nnU-Net and Mamba-like State Space Models with pseudo-labeling and consistency regularization, achieving a Dice score of 0.967 and Instance Affinity of 0.738 on the hidden test set. For registration, effective approaches combined PointNetLK with differentiable SVD and geometric augmentation to handle modality gaps; hybrid neural-classical refinement enabled accurate alignment despite limited labels. All data and code are publicly available at https://github.com/ricoleehduu/STS-Challenge-2025 to ensure reproducibility.

12.3HCApr 15Code
EMGFlow: Robust and Efficient Surface Electromyography Synthesis via Flow Matching

Boxuan Jiang, Chenyun Dai, Can Han

Deep learning-based surface electromyography (sEMG) gesture recognition is frequently bottlenecked by data scarcity and limited subject diversity. While synthetic data generation via Generative Adversarial Networks (GANs) and diffusion models has emerged as a promising augmentation strategy, these approaches often face challenges regarding training stability or inference efficiency. To bridge this gap, we propose EMGFlow, a conditional sEMG generation framework. To the best of our knowledge, this is the first study to investigate the application of Flow Matching (FM) and continuous-time generative modeling in the sEMG domain. To validate EMGFlow across three benchmark sEMG datasets, we employ a unified evaluation protocol integrating feature-based fidelity, distributional geometry, and downstream utility. Extensive evaluations show that EMGFlow outperforms conventional augmentation and GAN baselines, and provides stronger standalone utility than the diffusion baselines considered here under the train-on-synthetic test-on-real (TSTR) protocol. Furthermore, by optimizing generation dynamics through advanced numerical solvers and targeted time sampling, EMGFlow achieves improved quality-efficiency trade-offs. Taken together, these results suggest that Flow Matching is a promising and efficient paradigm for addressing data bottlenecks in myoelectric control systems. Our code is available at: https://github.com/Open-EXG/EMGFlow.

HCJan 26Code
Fusion of Spatio-Temporal and Multi-Scale Frequency Features for Dry Electrodes MI-EEG Decoding

Tianyi Gong, Can Han, Junxi Wu et al.

Dry-electrode Motor Imagery Electroencephalography (MI-EEG) enables fast, comfortable, real-world Brain Computer Interface by eliminating gels and shortening setup for at-home and wearable use.However, dry recordings pose three main issues: lower Signal-to-Noise Ratio with more baseline drift and sudden transients; weaker and noisier data with poor phase alignment across trials; and bigger variances between sessions. These drawbacks lead to larger data distribution shift, making features less stable for MI-EEG tasks.To address these problems, we introduce STGMFM, a tri-branch framework tailored for dry-electrode MI-EEG, which models complementary spatio-temporal dependencies via dual graph orders, and captures robust envelope dynamics with a multi-scale frequency mixing branch, motivated by the observation that amplitude envelopes are less sensitive to contact variability than instantaneous waveforms. Physiologically meaningful connectivity priors guide learning, and decision-level fusion consolidates a noise-tolerant consensus. On our collected dry-electrode MI-EEG, STGMFM consistently surpasses competitive CNN/Transformer/graph baselines. Codes are available at https://github.com/Tianyi-325/STGMFM.

IVNov 28, 2025Code
MICCAI STS 2024 Challenge: Semi-Supervised Instance-Level Tooth Segmentation in Panoramic X-ray and CBCT Images

Yaqi Wang, Zhi Li, Chengyu Wu et al.

Orthopantomogram (OPGs) and Cone-Beam Computed Tomography (CBCT) are vital for dentistry, but creating large datasets for automated tooth segmentation is hindered by the labor-intensive process of manual instance-level annotation. This research aimed to benchmark and advance semi-supervised learning (SSL) as a solution for this data scarcity problem. We organized the 2nd Semi-supervised Teeth Segmentation (STS 2024) Challenge at MICCAI 2024. We provided a large-scale dataset comprising over 90,000 2D images and 3D axial slices, which includes 2,380 OPG images and 330 CBCT scans, all featuring detailed instance-level FDI annotations on part of the data. The challenge attracted 114 (OPG) and 106 (CBCT) registered teams. To ensure algorithmic excellence and full transparency, we rigorously evaluated the valid, open-source submissions from the top 10 (OPG) and top 5 (CBCT) teams, respectively. All successful submissions were deep learning-based SSL methods. The winning semi-supervised models demonstrated impressive performance gains over a fully-supervised nnU-Net baseline trained only on the labeled data. For the 2D OPG track, the top method improved the Instance Affinity (IA) score by over 44 percentage points. For the 3D CBCT track, the winning approach boosted the Instance Dice score by 61 percentage points. This challenge confirms the substantial benefit of SSL for complex, instance-level medical image segmentation tasks where labeled data is scarce. The most effective approaches consistently leveraged hybrid semi-supervised frameworks that combined knowledge from foundational models like SAM with multi-stage, coarse-to-fine refinement pipelines. Both the challenge dataset and the participants' submitted code have been made publicly available on GitHub (https://github.com/ricoleehduu/STS-Challenge-2024), ensuring transparency and reproducibility.

IVMar 2, 2025Code
Robust Real-Time Endoscopic Stereo Matching under Fuzzy Tissue Boundaries

Yang Ding, Can Han, Sijia Du et al.

Real-time acquisition of accurate scene depth is essential for automated robotic minimally invasive surgery. Stereo matching with binocular endoscopy can provide this depth information. However, existing stereo matching methods, designed primarily for natural images, often struggle with endoscopic images due to fuzzy tissue boundaries and typically fail to meet real-time requirements for high-resolution endoscopic image inputs. To address these challenges, we propose \textbf{RRESM}, a real-time stereo matching method tailored for endoscopic images. Our approach integrates a 3D Mamba Coordinate Attention module that enhances cost aggregation through position-sensitive attention maps and long-range spatial dependency modeling via the Mamba block, generating a robust cost volume without substantial computational overhead. Additionally, we introduce a High-Frequency Disparity Optimization module that refines disparity predictions near tissue boundaries by amplifying high-frequency details in the wavelet domain. Evaluations on the SCARED and SERV-CT datasets demonstrate state-of-the-art matching accuracy with a real-time inference speed of 42 FPS. The code is available at https://github.com/Sonne-Ding/RRESM.

CVJul 29, 2024
Towards Open-Set Myoelectric Gesture Recognition via Dual-Perspective Inconsistency Learning

Chen Liu, Can Han, Chengfeng Zhou et al.

Gesture recognition based on surface electromyography (sEMG) has achieved significant progress in human-machine interaction (HMI), especially in prosthetic control and movement rehabilitation. However, accurately recognizing predefined gestures within a closed set is still inadequate in practice; a robust open-set system needs to effectively reject unknown gestures while correctly classifying known ones, which is rarely explored in the field of myoelectric gesture recognition. To handle this challenge, we first report a significant distinction in prediction inconsistency discovered for unknown classes, which arises from different perspectives and can substantially enhance open-set recognition performance. Based on this insight, we propose a novel dual-perspective inconsistency learning approach, PredIN, to explicitly magnify the prediction inconsistency by enhancing the inconsistency of class feature distribution within different perspectives. Specifically, PredIN maximizes the class feature distribution inconsistency among the dual perspectives to enhance their differences. Meanwhile, it optimizes inter-class separability within an individual perspective to maintain individual performance. Comprehensive experiments on various benchmark datasets demonstrate that the PredIN outperforms state-of-the-art methods by a clear margin. Our proposed method simultaneously achieves accurate closed-set classification for predefined gestures and effective rejection for unknown gestures, exhibiting its efficacy and superiority in open-set gesture recognition based on sEMG.

CVNov 11, 2025
SASG-DA: Sparse-Aware Semantic-Guided Diffusion Augmentation For Myoelectric Gesture Recognition

Chen Liu, Can Han, Weishi Xu et al.

Surface electromyography (sEMG)-based gesture recognition plays a critical role in human-machine interaction (HMI), particularly for rehabilitation and prosthetic control. However, sEMG-based systems often suffer from the scarcity of informative training data, leading to overfitting and poor generalization in deep learning models. Data augmentation offers a promising approach to increasing the size and diversity of training data, where faithfulness and diversity are two critical factors to effectiveness. However, promoting untargeted diversity can result in redundant samples with limited utility. To address these challenges, we propose a novel diffusion-based data augmentation approach, Sparse-Aware Semantic-Guided Diffusion Augmentation (SASG-DA). To enhance generation faithfulness, we introduce the Semantic Representation Guidance (SRG) mechanism by leveraging fine-grained, task-aware semantic representations as generation conditions. To enable flexible and diverse sample generation, we propose a Gaussian Modeling Semantic Sampling (GMSS) strategy, which models the semantic representation distribution and allows stochastic sampling to produce both faithful and diverse samples. To enhance targeted diversity, we further introduce a Sparse-Aware Semantic Sampling (SASS) strategy to explicitly explore underrepresented regions, improving distribution coverage and sample utility. Extensive experiments on benchmark sEMG datasets, Ninapro DB2, DB4, and DB7, demonstrate that SASG-DA significantly outperforms existing augmentation methods. Overall, our proposed data augmentation approach effectively mitigates overfitting and improves recognition performance and generalization by offering both faithful and diverse samples.

CVDec 5, 2023
Towards Open-set Gesture Recognition via Feature Activation Enhancement and Orthogonal Prototype Learning

Chen Liu, Can Han, Chengfeng Zhou et al.

Gesture recognition is a foundational task in human-machine interaction (HMI). While there has been significant progress in gesture recognition based on surface electromyography (sEMG), accurate recognition of predefined gestures only within a closed set is still inadequate in practice. It is essential to effectively discern and reject unknown gestures of disinterest in a robust system. Numerous methods based on prototype learning (PL) have been proposed to tackle this open set recognition (OSR) problem. However, they do not fully explore the inherent distinctions between known and unknown classes. In this paper, we propose a more effective PL method leveraging two novel and inherent distinctions, feature activation level and projection inconsistency. Specifically, the Feature Activation Enhancement Mechanism (FAEM) widens the gap in feature activation values between known and unknown classes. Furthermore, we introduce Orthogonal Prototype Learning (OPL) to construct multiple perspectives. OPL acts to project a sample from orthogonal directions to maximize the distinction between its two projections, where unknown samples will be projected near the clusters of different known classes while known samples still maintain intra-class similarity. Our proposed method simultaneously achieves accurate closed-set classification for predefined gestures and effective rejection for unknown gestures. Extensive experiments demonstrate its efficacy and superiority in open-set gesture recognition based on sEMG.