CVMar 17

Edge-Efficient Two-Stream Multimodal Architecture for Non-Intrusive Bathroom Fall Detection

Haitian Wang, Yiren Wang, Xinyu Wang, Sheldon Fung, Atif Mansoor

arXiv:2603.1706951.5h-index: 4

Predicted impact top 67% in CV · last 90 daysOriginality Incremental advance

AI Analysis

This addresses safety risks for seniors living alone with a privacy-preserving, non-intrusive fall detection system, though it is incremental over existing multimodal approaches.

The paper tackles the problem of detecting falls in bathrooms for seniors by proposing a two-stream multimodal architecture that processes radar and vibration signals, achieving 96.1% accuracy and reducing latency and energy consumption on edge devices.

Falls in wet bathroom environments are a major safety risk for seniors living alone. Recent work has shown that mmWave-only, vibration-only, and existing multimodal schemes, such as vibration-triggered radar activation, early feature concatenation, and decision-level score fusion, can support privacy-preserving, non-intrusive fall detection. However, these designs still treat motion and impact as loosely coupled streams, depending on coarse temporal alignment and amplitude thresholds, and do not explicitly encode the causal link between radar-observed collapse and floor impact or address timing drift, object drop confounders, and latency and energy constraints on low-power edge devices. To this end, we propose a two-stream architecture that encodes radar signals with a Motion--Mamba branch for long-range motion patterns and processes floor vibration with an Impact--Griffin branch that emphasizes impact transients and cross-axis coupling. Cross-conditioned fusion uses low-rank bilinear interaction and a Switch--MoE head to align motion and impact tokens and suppress object-drop confounders. The model keeps inference cost suitable for real-time execution on a Raspberry Pi 4B gateway. We construct a bathroom fall detection benchmark dataset with frame-level annotations, comprising more than 3~h of synchronized mmWave radar and triaxial vibration recordings across eight scenarios under running water, together with subject-independent training, validation, and test splits. On the test split, our model attains 96.1% accuracy, 94.8% precision, 88.0% recall, a 91.1% macro F1 score, and an AUC of 0.968. Compared with the strongest baseline, it improves accuracy by 2.0 percentage points and fall recall by 1.3 percentage points, while reducing latency from 35.9 ms to 15.8 ms and lowering energy per 2.56 s window from 14200 mJ to 10750 mJ on the Raspberry Pi 4B gateway.

View on arXiv PDF

Similar