CV · Apr 20, 2025

Talk is Not Always Cheap: Promoting Wireless Sensing Models with Text Prompts

arXiv:2504.14621v2 · 1 citation · h-index: 4 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses a specific bottleneck in wireless sensing for applications like public security and healthcare by enhancing model performance with text prompts, representing an incremental but practical advancement.

The paper tackles the problem that wireless sensing models do not use the textual information in their datasets. It proposes WiTalk, a text-enhanced framework that integrates semantic knowledge through prompt strategies, and reports gains on benchmark datasets for human action recognition and temporal action localization, up to 13.68% in mean average precision on XRFV2.

Wireless signal-based human sensing technologies, such as WiFi, millimeter-wave (mmWave) radar, and Radio Frequency Identification (RFID), enable the detection and interpretation of human presence, posture, and activities, thereby providing critical support for applications in public security, healthcare, and smart environments. These technologies exhibit notable advantages due to their non-contact operation and environmental adaptability; however, existing systems often fail to leverage the textual information inherent in datasets. To address this, we propose an innovative text-enhanced wireless sensing framework, WiTalk, that seamlessly integrates semantic knowledge through three hierarchical prompt strategies (label-only, brief description, and detailed action description) without requiring architectural modifications or incurring additional data costs. We rigorously validate this framework across three public benchmark datasets: XRF55 for human action recognition (HAR), and WiFiTAL and XRFV2 for WiFi temporal action localization (TAL). Experimental results demonstrate significant performance improvements: on XRF55, accuracy for WiFi, RFID, and mmWave increases by 3.9%, 2.59%, and 0.46%, respectively; on WiFiTAL, the average performance of WiFiTAD improves by 4.98%; and on XRFV2, the mean average precision gains across various methods range from 4.02% to 13.68%. Our code is available at https://github.com/yangzhenkui/WiTalk.
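
To make the three prompt levels named in the abstract concrete, here is a minimal Python sketch of how a class label might be expanded into a label-only, brief, or detailed prompt. It is an illustration only, not the WiTalk implementation: the names `PromptLevel`, `ACTION_TEXT`, and `build_prompt`, the "label: description" template, and the example descriptions are all assumptions introduced here and are not taken from the paper or its repository.

```python
from enum import Enum


class PromptLevel(Enum):
    """The three prompt levels named in the abstract (identifiers are hypothetical)."""
    LABEL_ONLY = "label_only"
    BRIEF = "brief"
    DETAILED = "detailed"


# Hypothetical description table; the wording is invented for illustration
# and does not come from XRF55, WiFiTAL, or XRFV2.
ACTION_TEXT = {
    "drinking": {
        "brief": "a person drinks from a cup",
        "detailed": (
            "a person lifts a cup with one hand, tilts it toward the mouth, "
            "swallows, and lowers the arm"
        ),
    },
}


def build_prompt(label: str, level: PromptLevel) -> str:
    """Return the text prompt for an action label at the requested level."""
    if level is PromptLevel.LABEL_ONLY:
        return label
    text = ACTION_TEXT.get(label, {}).get(level.value)
    # Fall back to the bare label when no description is available.
    return f"{label}: {text}" if text else label


# One prompt per level for a single action class.
prompts = {level.name: build_prompt("drinking", level) for level in PromptLevel}
```

In a setup like this, the resulting strings would typically be passed to a text encoder and combined with the sensing model's features; how WiTalk actually does that is not specified in the text above, so the sketch stops at prompt construction.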

Code Implementations: 1 repo
