CV Dec 4, 2025
WiFi-based Cross-Domain Gesture Recognition Using Attention Mechanism
Ruijing Liu, Cunhua Pan, Jiaming Zeng et al.
While fulfilling communication tasks, wireless signals can also be used to sense the environment. Among the various sensing media, WiFi signals offer widespread availability, low hardware cost, and strong robustness to environmental conditions such as light, temperature, and humidity. By analyzing the WiFi signals in an environment, it is possible to capture dynamic changes of the human body and support sensing applications such as gesture recognition. However, many existing gesture sensing solutions perform well in-domain but lack cross-domain capability (i.e., good recognition performance in untrained environments). To address this, we extract Doppler spectra from the channel state information (CSI) received by all receivers and concatenate the Doppler spectra along the same time axis to generate fused images that carry multi-angle information as input features. Furthermore, inspired by the convolutional block attention module (CBAM), we propose a gesture recognition network that integrates a multi-semantic spatial attention mechanism with a self-attention-based channel mechanism. This network constructs attention maps to quantify the spatiotemporal features of gestures in the images, enabling the extraction of key domain-independent features. Additionally, ResNet18 is employed as the backbone network to capture deeper features. We evaluate the proposed network on the public Widar3 dataset, and the results show that it not only maintains a high in-domain accuracy of 99.72%, but also achieves a cross-domain recognition accuracy of 97.61%, significantly outperforming the best existing solutions.
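The fusion step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `doppler_spectrogram` and `fuse_receivers`, the window and hop sizes, the use of a plain Hann-windowed short-time FFT, and the choice to stack the per-receiver spectra along the Doppler-frequency axis (so that every receiver shares one time axis) are all assumptions made for illustration.

```python
import numpy as np

def doppler_spectrogram(csi, win=64, hop=16):
    """Magnitude short-time FFT of one complex CSI stream.

    Returns an array of shape (win, n_frames); after fftshift,
    row win // 2 corresponds to zero Doppler shift.
    """
    csi = np.asarray(csi, dtype=complex)
    csi = csi - csi.mean()                      # suppress the static (zero-Doppler) path
    window = np.hanning(win)
    n_frames = 1 + (len(csi) - win) // hop
    frames = np.stack(
        [csi[i * hop : i * hop + win] * window for i in range(n_frames)],
        axis=1,
    )                                           # (win, n_frames)
    spectrum = np.fft.fftshift(np.fft.fft(frames, axis=0), axes=0)
    return np.abs(spectrum)

def fuse_receivers(csi_per_rx, win=64, hop=16):
    """Stack per-receiver spectrograms so they share a single time axis.

    Assumption: "fused image" means concatenation along the frequency
    axis, giving shape (n_rx * win, n_frames).
    """
    specs = [doppler_spectrogram(c, win, hop) for c in csi_per_rx]
    return np.concatenate(specs, axis=0)

# Usage with synthetic data (3 hypothetical receivers, 1000 CSI samples each).
rng = np.random.default_rng(0)
streams = [rng.standard_normal(1000) + 1j * rng.standard_normal(1000) for _ in range(3)]
fused = fuse_receivers(streams)
print(fused.shape)  # (192, 59)
```

The resulting 2-D array could then be treated as a single-channel image and passed to an image backbone such as ResNet18, which is the role the abstract assigns to the fused features.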
40.4 SP Apr 23
Robust Cross-Domain WiFi Fall Detection via Physics-Driven Attention-Enhanced Transformers
Yingzhe Wang, Cunhua Pan, Ruijing Liu et al.
Device-free fall detection utilizing WiFi Channel State Information (CSI) has emerged as a promising, privacy-preserving solution for elderly health monitoring in the Internet of Things (IoT) era. However, existing deep learning approaches suffer from severe performance degradation when deployed in unseen environments due to static background overfitting and Non-Line-of-Sight (NLoS) signal attenuation. To address these critical bottlenecks, we propose a robust, domain-generalizable framework featuring a novel Attention-Enhanced CNN-Transformer hybrid architecture. First, we design a physics-driven Dynamic Variance Gate (DVG) to dynamically calculate local temporal variance, acting as a soft-attention mask that eliminates static environmental DC components while amplifying dynamic human motion. Second, we introduce a Physics-Aware Data Augmentation strategy to force the network to learn invariant morphological signatures rather than environment-specific noise. Furthermore, a Convolutional Block Attention Module (CBAM) is integrated to refine spatiotemporal features prior to Transformer-based sequence modeling. Extensive cross-domain evaluations across four distinct indoor environments demonstrate that our method achieves 97.6% accuracy in NLoS scenarios and 98.8% in completely unseen environments without target-domain fine-tuning. Finally, we deploy the proposed framework on an edge computing system equipped with commercial WiFi NICs. Real-world live inference field tests confirm the system's robustness against unseen environmental layouts and its capability for continuous, low-latency whole-home safety monitoring.
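The abstract does not give the exact form of the Dynamic Variance Gate, so the following is only a rough sketch of the general idea: a sliding-window variance used as a soft mask over CSI amplitudes. The function name `dynamic_variance_gate`, the window length, the edge padding, and the normalization of the gate by the per-subcarrier mean variance are all assumptions, not details taken from the paper.

```python
import numpy as np

def dynamic_variance_gate(x, win=9, eps=1e-8):
    """Hypothetical variance-based soft gate.

    x: real CSI amplitudes with shape (T, C) = (time, subcarrier).
    Returns (gated, gate), both of shape (T, C). The gate lies in [0, 1).
    """
    x = np.asarray(x, dtype=float)
    pad = win // 2                              # win is assumed to be odd
    xp = np.pad(x, ((pad, pad), (0, 0)), mode="edge")
    # windows has shape (T, C, win): one local window per time step and subcarrier
    windows = np.lib.stride_tricks.sliding_window_view(xp, win, axis=0)
    local_mean = windows.mean(axis=-1)
    local_var = windows.var(axis=-1)
    # Soft mask: close to 0 where the signal is static, larger where it fluctuates.
    scale = local_var.mean(axis=0, keepdims=True)
    gate = local_var / (local_var + scale + eps)
    # Subtracting the local mean removes the static (DC) component.
    return gate * (x - local_mean), gate

# Usage: a static background of amplitude 5 with a short burst of motion.
T, C = 300, 4
x = np.full((T, C), 5.0)
t = np.arange(50)
x[100:150] += np.sin(2 * np.pi * 0.1 * t)[:, None]
gated, gate = dynamic_variance_gate(x)
```

In this toy input the static segments have zero local variance, so the gate and the gated output vanish there, while the burst between samples 100 and 150 passes through with a nonzero weight. A learned or thresholded gate would be an equally plausible reading of the abstract.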