SDLG
Jul 26, 2025

Improving Deep Learning-based Respiratory Sound Analysis with Frequency Selection and Attention Mechanism

arXiv:2507.20052v1
1 citation
h-index: 13
Originality: Incremental advance
AI Analysis

This work addresses the need for reliable, real-time respiratory sound analysis in resource-constrained healthcare settings, representing an incremental improvement over existing methods.

The paper tackles respiratory sound classification by proposing a CNN-Temporal Self-Attention (CNN-TSA) network with a Frequency Band Selection (FBS) module. FBS improves accuracy and cuts FLOPs by up to 50%, and the combined model sets new benchmarks on SPRSound-2022/2023 and reaches state-of-the-art performance on ICBHI-2017.

Accurate classification of respiratory sounds requires deep learning models that effectively capture fine-grained acoustic features and long-range temporal dependencies. Convolutional Neural Networks (CNNs) are well-suited for extracting local time-frequency patterns but are limited in modeling global context. In contrast, transformer-based models can capture long-range dependencies, albeit with higher computational demands. To address these limitations, we propose a compact CNN-Temporal Self-Attention (CNN-TSA) network that integrates lightweight self-attention into an efficient CNN backbone. Central to our approach is a Frequency Band Selection (FBS) module that suppresses noisy and non-informative frequency regions, substantially improving accuracy and reducing FLOPs by up to 50%. We also introduce age-specific models to enhance robustness across diverse patient groups. Evaluated on the SPRSound-2022/2023 and ICBHI-2017 lung sound datasets, CNN-TSA with FBS sets new benchmarks on SPRSound and achieves state-of-the-art performance on ICBHI, all with a significantly smaller computational footprint. Furthermore, integrating FBS into an existing transformer baseline yields a new record on ICBHI, confirming FBS as an effective drop-in enhancement. These results demonstrate that our framework enables reliable, real-time respiratory sound analysis suitable for deployment in resource-constrained settings.
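To make the link between frequency selection and compute concrete, here is a minimal sketch, not the paper's implementation. It assumes FBS can be approximated by keeping a contiguous band of spectrogram bins, and it counts the multiply-accumulates of a single stride-1 convolution as a stand-in for FLOPs. The helper names (`select_band`, `conv2d_macs`), the 64 x 100 input shape, and the band limits are all hypothetical; the abstract does not specify how bands are chosen.

```python
# Illustrative sketch only: band limits, shapes, and helper names are
# assumptions, not values or APIs taken from the paper.

def select_band(spec, lo_bin, hi_bin):
    """Keep frequency rows lo_bin..hi_bin-1 of a (freq x time) spectrogram."""
    if not 0 <= lo_bin < hi_bin <= len(spec):
        raise ValueError("band must lie within the spectrogram")
    return [row[:] for row in spec[lo_bin:hi_bin]]


def conv2d_macs(height, width, c_in, c_out, kernel=3):
    """Multiply-accumulates of one stride-1, same-padded 2D convolution."""
    return height * width * c_in * c_out * kernel * kernel


# Hypothetical input: 64 frequency bins x 100 time frames.
n_bins, n_frames = 64, 100
spec = [[float(f + t) for t in range(n_frames)] for f in range(n_bins)]

# Assumed band: keep the lower half of the bins.
band = select_band(spec, lo_bin=0, hi_bin=32)

full_cost = conv2d_macs(n_bins, n_frames, c_in=1, c_out=16)
band_cost = conv2d_macs(len(band), len(band[0]), c_in=1, c_out=16)
saving = 1 - band_cost / full_cost

print(f"kept {len(band)} of {n_bins} bins, conv cost reduced by {saving:.0%}")
# prints: kept 32 of 64 bins, conv cost reduced by 50%
```

Because convolution cost scales linearly with the input height, dropping half of the frequency bins halves the cost of every layer that preserves this resolution. That is consistent with the "up to 50%" FLOPs reduction quoted above, though the actual saving depends on which bands the FBS module retains.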

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
