SPAug 1, 2024
A deep learning-enabled smart garment for accurate and versatile sleep conditions monitoring in daily lifeChenyu Tang, Wentian Yi, Muzi Xu et al.
In wearable smart systems, continuous monitoring and accurate classification of different sleep-related conditions are critical for enhancing sleep quality and preventing sleep-related chronic conditions. However, the requirements for device-skin coupling quality in electrophysiological sleep monitoring systems hinder the comfort and reliability of night wearing. Here, we report a washable, skin-compatible smart garment sleep monitoring system that captures local skin strain signals under weak device-skin coupling conditions without positioning or skin preparation requirements. A printed textile-based strain sensor array responds to strain from 0.1% to 10% with a gauge factor as high as 100 and shows independence to extrinsic motion artefacts via strain-isolating printed pattern design. Through reversible starching treatment, ink penetration depth during direct printing on garments is controlled to achieve batch-to-batch performance variation < 10%. Coupled with deep learning, explainable artificial intelligence (XAI), and transfer learning data processing, the smart garment is capable of classifying six sleep states with an accuracy of 98.6%, maintaining excellent explainability (classification with low bias) and generalization (95% accuracy on new users with few-shot learning less than 15 samples per class) in practical applications, paving the way for next-generation daily sleep healthcare management.
45.5CVMay 10
LiteMedCoT-VL: Parameter-Efficient Adaptation for Medical Visual Question AnsweringRunze Ma, Shunbo Jia, Haonan Lyu et al.
The reasoning gap between large and compact vision-language models (VLMs) limits the deployment of medical AI on portable clinical devices. Compact VLMs of 2--4B parameters can run on resource-constrained hardware but lack the multi-step reasoning capacity needed for interpretable clinical decision support. Existing knowledge distillation methods transfer answers without the reasoning process behind them. Medical visual question answering (VQA) serves as a testbed for this problem, as it requires models to integrate visual evidence with clinical knowledge through structured reasoning chains. We introduce LiteMedCoT-VL, a pipeline that transfers chain-of-thought reasoning from a 235B teacher model to 2B student models through LoRA-based fine-tuning on explanation-enriched training data. All inference is conducted without image captions by default, simulating the clinical scenario in which a physician interprets a medical image directly without an accompanying radiology report. On the PMC-VQA benchmark, LiteMedCoT-VL achieves 64.9% accuracy, exceeding the zero-shot Qwen3-VL-4B baseline of 53.9% by 11.0 percentage points and outperforming all published baselines. This result indicates that a 2B model with reasoning distillation can match or exceed models with twice the parameters. Visual grounding analysis shows that the model relies on image content rather than exploiting textual priors. Our code is publicly available at https://anonymous.4open.science/r/LiteMedCoT-VL.
LGJan 12
BEAT-Net: Injecting Biomimetic Spatio-Temporal Priors for Interpretable ECG ClassificationRunze Ma, Caizhi Liao
Although deep learning has advanced automated electrocardiogram (ECG) diagnosis, prevalent supervised methods typically treat recordings as undifferentiated one-dimensional (1D) signals or two-dimensional (2D) images. This formulation compels models to learn physiological structures implicitly, resulting in data inefficiency and opacity that diverge from medical reasoning. To address these limitations, we propose BEAT-Net, a Biomimetic ECG Analysis with Tokenization framework that reformulates the problem as a language modeling task. Utilizing a QRS tokenization strategy to transform continuous signals into biologically aligned heartbeat sequences, the architecture explicitly decomposes cardiac physiology through specialized encoders that extract local beat morphology while normalizing spatial lead perspectives and modeling temporal rhythm dependencies. Evaluations across three large-scale benchmarks demonstrate that BEAT-Net matches the diagnostic accuracy of dominant convolutional neural network (CNN) architectures while substantially improving robustness. The framework exhibits exceptional data efficiency, recovering fully supervised performance using only 30 to 35 percent of annotated data. Moreover, learned attention mechanisms provide inherent interpretability by spontaneously reproducing clinical heuristics, such as Lead II prioritization for rhythm analysis, without explicit supervision. These findings indicate that integrating biological priors offers a computationally efficient and interpretable alternative to data-intensive large-scale pre-training.
LGDec 31, 2025
CPR: Causal Physiological Representation Learning for Robust ECG Analysis under Distribution ShiftsShunbo Jia, Caizhi Liao
Deep learning models for Electrocardiogram (ECG) diagnosis have achieved remarkable accuracy but exhibit fragility against adversarial perturbations, particularly Smooth Adversarial Perturbations (SAP) that mimic biological morphology. Existing defenses face a critical dilemma: Adversarial Training (AT) provides robustness but incurs a prohibitive computational burden, while certified methods like Randomized Smoothing (RS) introduce significant inference latency, rendering them impractical for real-time clinical monitoring. We posit that this vulnerability stems from the models' reliance on non-robust spurious correlations rather than invariant pathological features. To address this, we propose Causal Physiological Representation Learning (CPR). Unlike standard denoising approaches that operate without semantic constraints, CPR incorporates a Physiological Structural Prior within a causal disentanglement framework. By modeling ECG generation via a Structural Causal Model (SCM), CPR enforces a structural intervention that strictly separates invariant pathological morphology (P-QRS-T complex) from non-causal artifacts. Empirical results on PTB-XL demonstrate that CPR significantly outperforms standard clinical preprocessing methods. Specifically, under SAP attacks, CPR achieves an F1 score of 0.632, surpassing Median Smoothing (0.541 F1) by 9.1%. Crucially, CPR matches the certified robustness of Randomized Smoothing while maintaining single-pass inference efficiency, offering a superior trade-off between robustness, efficiency, and clinical interpretability.