SP Jul 25, 2024
GesturePrint: Enabling User Identification for mmWave-based Gesture Recognition Systems
Lilin Xu, Keyi Wang, Chaojie Gu et al.
The millimeter-wave (mmWave) radar has been exploited for gesture recognition. However, existing mmWave-based gesture recognition methods cannot identify different users, which is important for ubiquitous gesture interaction in many applications. In this paper, we propose GesturePrint, which is the first to achieve gesture recognition and gesture-based user identification using a commodity mmWave radar sensor. GesturePrint features an effective pipeline that enables the gesture recognition system to identify users at a minor additional cost. By introducing an efficient signal preprocessing stage and a network architecture GesIDNet, which employs an attention-based multilevel feature fusion mechanism, GesturePrint effectively extracts unique gesture features for gesture recognition and personalized motion pattern features for user identification. We implement GesturePrint and collect data from 17 participants performing 15 gestures in a meeting room and an office, respectively. GesturePrint achieves a gesture recognition accuracy (GRA) of 98.87% with a user identification accuracy (UIA) of 99.78% in the meeting room, and 98.22% GRA with 99.26% UIA in the office. Extensive experiments on three public datasets and a new gesture dataset show GesturePrint's superior performance in enabling effective user identification for gesture recognition systems.
IV Jan 7
Staged Voxel-Level Deep Reinforcement Learning for 3D Medical Image Segmentation with Noisy Annotations
Yuyang Fu, Xiuzhen Guo, Ji Shi
Deep learning has achieved significant advancements in medical image segmentation. Currently, obtaining accurate segmentation outcomes is critically reliant on large-scale datasets with high-quality annotations. However, noisy annotations are frequently encountered owing to the complex morphological structures of organs in medical images and variations among different annotators, which can substantially limit the efficacy of segmentation models. Motivated by the fact that medical imaging annotators can correct labeling errors during segmentation based on prior knowledge, we propose an end-to-end Staged Voxel-Level Deep Reinforcement Learning (SVL-DRL) framework for robust medical image segmentation under noisy annotations. This framework employs a dynamic iterative update strategy to automatically mitigate the impact of erroneous labels without requiring manual intervention. The key advancements of SVL-DRL over existing works include: i) formulating noisy annotations as a voxel-dependent problem and addressing it through a novel staged reinforcement learning framework which guarantees robust model convergence; ii) incorporating a voxel-level asynchronous advantage actor-critic (vA3C) module that treats each voxel as an autonomous agent, which allows each agent to dynamically refine its own state representation during training, thereby directly mitigating the influence of erroneous labels; iii) designing a novel action space for the agents, along with a composite reward function that combines the Dice value and a spatial continuity metric to significantly boost segmentation accuracy while maintaining semantic integrity. Experiments on three public medical image datasets demonstrate state-of-the-art (SoTA) performance under various experimental settings, with an average improvement of over 3% in both Dice and IoU scores.
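For readers unfamiliar with the two metrics reported in this abstract, the following is a minimal sketch of how Dice and IoU are conventionally computed on binary masks. It is not the authors' reward function or evaluation code; the function name `dice_and_iou`, the flat-list mask representation, and the empty-mask convention are assumptions made for illustration.

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two equal-length flat binary masks (0/1 values)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    # Count voxels that are foreground in both masks, and in each mask alone.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_sum = sum(1 for p in pred if p)
    target_sum = sum(1 for t in target if t)
    union = pred_sum + target_sum - intersection
    if union == 0:
        # Both masks are empty: treated as perfect agreement (assumed convention).
        return 1.0, 1.0
    dice = 2.0 * intersection / (pred_sum + target_sum)
    iou = intersection / union
    return dice, iou


# Toy example: six voxels, flattened.
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 1, 1, 0]
dice, iou = dice_and_iou(pred, target)
print(f"Dice={dice:.4f}, IoU={iou:.4f}")
```

Dice weights the overlap twice relative to the two mask sizes, so it is never lower than IoU on the same pair of masks.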
LG Sep 20, 2025
mmExpert: Integrating Large Language Models for Comprehensive mmWave Data Synthesis and Understanding
Yifan Yan, Shuai Yang, Xiuzhen Guo et al.
Millimeter-wave (mmWave) sensing technology holds significant value in human-centric applications, yet the high costs associated with data acquisition and annotation limit its widespread adoption in our daily lives. Concurrently, the rapid evolution of large language models (LLMs) has opened up opportunities for addressing complex human needs. This paper presents mmExpert, an innovative mmWave understanding framework consisting of a data generation flywheel that leverages LLMs to automate the generation of synthetic mmWave radar datasets for specific application scenarios, thereby training models capable of zero-shot generalization in real-world environments. Extensive experiments demonstrate that the data synthesized by mmExpert significantly enhances the performance of downstream models and facilitates the successful deployment of large models for mmWave understanding.
CV Mar 12, 2025
Diff-CL: A Novel Cross Pseudo-Supervision Method for Semi-supervised Medical Image Segmentation
Xiuzhen Guo, Lianyuan Yu, Ji Shi et al.
Semi-supervised learning utilizes insights from unlabeled data to improve model generalization, thereby reducing reliance on large labeled datasets. Most existing studies focus on limited samples and fail to capture the overall data distribution. We contend that combining distributional information with detailed information is crucial for achieving more robust and accurate segmentation results. On the one hand, with their robust generative capabilities, diffusion models (DMs) learn the data distribution effectively. However, they struggle to capture fine detail, leading to generated images with misleading details. Combining DMs with convolutional neural networks (CNNs) enables the former to learn the data distribution while the latter correct fine details. However, capturing complete high-frequency details with CNNs requires substantial computational resources and is susceptible to local noise. On the other hand, given that both labeled and unlabeled data come from the same distribution, we believe that regions in unlabeled data whose semantics are similar to the overall class semantics of the labeled data are likely to belong to the same class, while regions with minimal similarity are less likely to. This work introduces a semi-supervised medical image segmentation framework from the distribution perspective (Diff-CL). Firstly, we propose a cross-pseudo-supervision learning mechanism between diffusion and convolution segmentation networks. Secondly, we design a high-frequency Mamba module to capture boundary and detail information globally. Finally, we apply contrastive learning for label propagation from labeled to unlabeled data. Our method achieves state-of-the-art (SOTA) performance across three datasets: left atrium, brain tumor, and NIH pancreas.
CV Mar 12, 2025
GIGP: A Global Information Interacting and Geometric Priors Focusing Framework for Semi-supervised Medical Image Segmentation
Lianyuan Yu, Xiuzhen Guo, Ji Shi et al.
Semi-supervised learning enhances medical image segmentation by leveraging unlabeled data, reducing reliance on extensive labeled datasets. On the one hand, the distribution discrepancy between limited labeled data and abundant unlabeled data can hinder model generalization. Most existing methods rely on local similarity matching, which may introduce bias. In contrast, Mamba effectively models global context with linear complexity, learning more comprehensive data representations. On the other hand, medical images usually exhibit consistent anatomical structures defined by geometric features. Most existing methods fail to fully utilize global geometric priors such as volumes and moments. In this work, we introduce a global information interaction and geometric priors focus framework (GIGP). Firstly, we present a Global Information Interaction Mamba module to reduce the distribution discrepancy between labeled and unlabeled data. Secondly, we propose a Geometric Moment Attention Mechanism to extract richer global geometric features. Finally, we propose Global Geometric Perturbation Consistency to simulate organ dynamics and geometric variations, enhancing the ability of the model to learn generalized features. The superior performance on the NIH Pancreas and Left Atrium datasets demonstrates the effectiveness of our approach.
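To make the "volumes and moments" priors mentioned in this abstract concrete, here is a minimal sketch of the zeroth-order moment (voxel volume) and the normalized first-order moments (centroid) of a 3D binary mask. It illustrates the general notion of geometric moments only and is not the paper's Geometric Moment Attention Mechanism; the function name `volume_and_centroid` and the nested-list `[z][y][x]` layout are assumptions.

```python
def volume_and_centroid(mask):
    """Return (volume, centroid) of a 3D binary mask given as nested lists [z][y][x].

    volume is the zeroth-order moment (number of foreground voxels);
    centroid is the tuple of first-order moments divided by the volume,
    in (z, y, x) index order. Returns (0, None) for an empty mask.
    """
    volume = 0
    sum_z = sum_y = sum_x = 0
    for z, plane in enumerate(mask):
        for y, row in enumerate(plane):
            for x, value in enumerate(row):
                if value:
                    volume += 1
                    sum_z += z
                    sum_y += y
                    sum_x += x
    if volume == 0:
        return 0, None
    return volume, (sum_z / volume, sum_y / volume, sum_x / volume)


# Toy 2x2x2 mask with foreground voxels at (0,0,0), (0,0,1) and (1,1,1).
mask = [
    [[1, 1], [0, 0]],
    [[0, 0], [0, 1]],
]
volume, centroid = volume_and_centroid(mask)
print(volume, centroid)
```

Higher-order moments follow the same pattern, summing products of powers of the centered coordinates instead of the raw indices.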