HC · Mar 7
Pre-Clinical Latency Characterization of VRxBioRelax: A Real-Time EMG Biofeedback System for Muscle Relaxation in Virtual Reality
Melanie Baumgartner, Raphael Weibel, Tobias Hoesli et al.
Chronic tension in the upper trapezius (UT), often caused by poor ergonomics, prolonged posture, or psychological stress, contributes to musculoskeletal discomfort, headaches, and impaired interoceptive awareness. Although surface electromyography (sEMG) biofeedback can promote UT relaxation, traditional systems using conventional displays often fail to sustain engagement. Virtual reality (VR) offers a more immersive alternative, provided that latency remains below perceptual thresholds. We introduce VRxBioRelax, a closed-loop VR biofeedback system that streams sEMG data from Delsys Trigno Avanti sensors via MQTT to a Unity scene. Muscle activation drives a dynamic dawn-to-dusk landscape synchronized with a progressive muscle relaxation protocol. To validate system responsiveness, 87,716 EMG samples from the NinaPro DB2 dataset were replayed at $\sim$75 Hz. Timestamps recorded at four key stages, namely acquisition, root mean square (RMS) processing, network receipt, and rendering, revealed mean latencies of 0.50 ms (processing), 5.62 ms (network), and 19.22 ms (rendering), yielding an average end-to-end delay of 25.34 ms. Notably, 99.3% of frames arrived within 50 ms. One-sided t-tests confirmed that mean latency was significantly lower than both the 30 ms VR comfort limit ($t_{87\,715}=-25.2$, $p=5.9{\times}10^{-140}$) and the 50 ms clinical benchmark ($t_{87\,715}=-133.3$, $p<10^{-300}$). These findings support VRxBioRelax for use in remote interoceptive training, stress reduction, and telepresence-enabled rehabilitation.
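The latency analysis described in this abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the function names, the timestamp layout, and the normal approximation used for the p-value are all assumptions made here, and the approximation is only reasonable for large samples such as the 87,716 reported.

```python
import math
import statistics


def stage_latencies(t_acq, t_rms, t_net, t_render):
    """Per-stage and end-to-end latency (ms) from four timestamps of one sample.

    The four arguments are hypothetical names for the timestamps taken at
    acquisition, RMS processing, network receipt, and rendering.
    """
    return {
        "processing": t_rms - t_acq,
        "network": t_net - t_rms,
        "rendering": t_render - t_net,
        "end_to_end": t_render - t_acq,
    }


def one_sided_t_below(latencies_ms, threshold_ms):
    """One-sample t statistic for H1: mean latency < threshold_ms.

    Returns (t, degrees_of_freedom, approximate_p). The p-value uses the
    standard normal CDF as a stand-in for the t distribution (assumption:
    the sample is large enough for the two to be close).
    """
    n = len(latencies_ms)
    mean = statistics.fmean(latencies_ms)
    sd = statistics.stdev(latencies_ms)  # sample standard deviation (n - 1)
    t = (mean - threshold_ms) / (sd / math.sqrt(n))
    p = 0.5 * math.erfc(-t / math.sqrt(2))  # P(Z <= t), lower tail
    return t, n - 1, p


# Timestamps chosen so the stage latencies match the reported means.
stages = stage_latencies(0.0, 0.50, 6.12, 25.34)

# Made-up end-to-end latencies, tested against the 30 ms VR comfort limit.
t_stat, dof, p_value = one_sided_t_below([24, 25, 26, 25, 24, 26], 30.0)
```

A strongly negative `t_stat` with a small `p_value` corresponds to the abstract's conclusion that mean latency lies below the threshold; the same call with `threshold_ms=50.0` would mirror the comparison against the clinical benchmark.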
LG · Oct 3, 2025
RAxSS: Retrieval-Augmented Sparse Sampling for Explainable Variable-Length Medical Time Series Classification
Aydin Javadov, Samir Garibov, Tobias Hoesli et al.
Medical time series analysis is challenging due to data sparsity, noise, and highly variable recording lengths. Prior work has shown that stochastic sparse sampling effectively handles variable-length signals, while retrieval-augmented approaches improve explainability and robustness to noise and weak temporal correlations. In this study, we generalize the stochastic sparse sampling framework to retrieval-informed classification. Specifically, we weight window predictions by within-channel similarity and aggregate them in probability space, yielding convex series-level scores and an explicit evidence trail for explainability. Our method achieves competitive iEEG classification performance while giving practitioners greater transparency into its predictions. We evaluate the method on iEEG recordings collected at four medical centers, demonstrating its potential for reliable and explainable classification of variable-length clinical time series.
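One way to picture the similarity-weighted aggregation in probability space is the sketch below. It is an illustration under stated assumptions and not the RAxSS implementation: the abstract does not say how similarities are turned into weights, so the softmax normalization, the `temperature` parameter, and the function names are hypothetical choices made here. Any non-negative weights that sum to 1 would give the same convexity property.

```python
import math


def softmax(scores, temperature=1.0):
    """Turn raw similarity scores into non-negative weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp((s - peak) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_windows(window_probs, similarities, temperature=1.0):
    """Convex combination of per-window class probabilities.

    window_probs: one probability vector per sampled window.
    similarities: one similarity score per window (assumed precomputed).
    Returns (series_probs, weights); the weights serve as a simple evidence
    trail showing how much each window contributed.
    """
    weights = softmax(similarities, temperature)
    n_classes = len(window_probs[0])
    series_probs = [
        sum(w * probs[c] for w, probs in zip(weights, window_probs))
        for c in range(n_classes)
    ]
    return series_probs, weights


# Three windows of one recording, two classes, made-up similarity scores.
probs, weights = aggregate_windows(
    [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3]],
    [2.0, 0.5, 1.0],
)
```

Because the weights are non-negative and sum to 1, each series-level score stays between the smallest and largest window-level probability for that class, and the scores still sum to 1 across classes.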
CL · Sep 28, 2025
Compose and Fuse: Revisiting the Foundational Bottlenecks in Multimodal Reasoning
Yucheng Wang, Yifan Hou, Aydin Javadov et al. · eth-zurich
Multimodal large language models (MLLMs) promise enhanced reasoning by integrating diverse inputs such as text, vision, and audio. Yet cross-modal reasoning remains underexplored, with conflicting reports on whether added modalities help or harm performance. These inconsistencies stem from a lack of controlled evaluation frameworks and of analyses of model internals that isolate when and why modality interactions support or undermine reasoning. We address this gap through a logic-grounded evaluation framework that categorizes multimodal reasoning into six interaction patterns, varying how facts are distributed across modalities and logically combined. Empirically, additional modalities enhance reasoning only when they provide independent and sufficient reasoning paths, while redundant or chained entailment support often hurts performance. Moreover, reasoning degrades in three systematic ways: weaker modalities drag down overall performance, conflicts bias preference toward certain modalities, and joint signals from different modalities fail to be integrated effectively. We therefore identify two core failures: a task-composition bottleneck, where recognition and reasoning cannot be jointly executed in one pass, and a fusion bottleneck, where early integration introduces bias. Probing further, we find that attention patterns fail to encode fact usefulness, but a simple two-step prompting strategy (recognize, then reason) restores performance, confirming the task-composition bottleneck. Moreover, modality identity remains recoverable in early layers, and softening attention in early fusion improves reasoning, highlighting biased fusion as another failure mode. Overall, our findings show that integration, not perception, is the main barrier to multimodal reasoning, suggesting composition-aware training and early fusion control as promising directions.
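The two-step "recognize, then reason" prompting mentioned in this abstract might look roughly like the sketch below. It is only an illustration: the `model` callable, the prompt wording, and the use of a plain string for the input are hypothetical stand-ins, since the abstract does not give the actual prompts, and in practice the input would be an image or audio clip passed through an MLLM's own interface.

```python
from typing import Callable, Tuple

# Hypothetical interface: any function that maps a prompt string to a reply.
Model = Callable[[str], str]


def single_pass(model: Model, observation: str, question: str) -> str:
    """Baseline: recognition and reasoning are requested in one call."""
    return model(f"{observation}\n\nQuestion: {question}\nAnswer:")


def recognize_then_reason(
    model: Model, observation: str, question: str
) -> Tuple[str, str]:
    """Two calls: first extract the facts, then reason over those facts only."""
    # Step 1 (recognize): ask only for the facts contained in the input.
    facts = model(
        f"List the facts stated in the following input.\n\n{observation}\n\nFacts:"
    )
    # Step 2 (reason): answer using the extracted facts, not the raw input.
    answer = model(
        f"Facts:\n{facts}\n\nUsing only these facts, answer the question.\n"
        f"Question: {question}\nAnswer:"
    )
    return facts, answer
```

Separating the calls means the reasoning step sees an explicit list of facts rather than the raw input, which is the intuition behind the task-composition bottleneck described in the abstract.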