AS | LG | SD | May 23, 2023

Patch-Mix Contrastive Learning with Audio Spectrogram Transformer on Respiratory Sound Classification

arXiv:2305.14032v5 | 67 citations
Originality: Incremental advance
AI Analysis

This work addresses the scarcity of medical data in contact-free diagnosis of lung diseases and represents an incremental advance in deep learning for healthcare.

The study tackles respiratory sound classification for lung disease diagnosis by adapting a pretrained model to the task and introducing Patch-Mix augmentation with contrastive learning. It reports state-of-the-art performance on the ICBHI dataset, improving on the prior leading score by 4.08%.

Respiratory sound contains crucial information for the early diagnosis of fatal lung diseases. Since the COVID-19 pandemic, there has been growing interest in contact-free medical care based on electronic stethoscopes. To this end, cutting-edge deep learning models have been developed to diagnose lung diseases; however, the task remains challenging due to the scarcity of medical data. In this study, we demonstrate that models pretrained on large-scale visual and audio datasets can be generalized to the respiratory sound classification task. In addition, we introduce a straightforward Patch-Mix augmentation, which randomly mixes patches between different samples, with Audio Spectrogram Transformer (AST). We further propose a novel and effective Patch-Mix Contrastive Learning to distinguish the mixed representations in the latent space. Our method achieves state-of-the-art performance on the ICBHI dataset, outperforming the prior leading score by 4.08%.
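The abstract describes Patch-Mix only as randomly mixing patches between different samples, so the following is a minimal sketch of that idea rather than the authors' implementation. The function name `patch_mix`, the Beta-distributed mixing ratio, the single patch mask shared across the batch, and the `(batch, num_patches, dim)` layout are all assumptions made for illustration.

```python
import numpy as np


def patch_mix(patches, rng, alpha=1.0):
    """Swap a random subset of patches with those of another sample.

    patches: array of shape (batch, num_patches, dim), e.g. patch
             embeddings of spectrograms (layout is an assumption).
    Returns the mixed batch, the partner index of each sample, the
    boolean mask of replaced patch positions, and the fraction of
    patches each sample kept from itself.
    """
    batch, num_patches, _ = patches.shape

    # Assumption: the share of replaced patches is drawn from Beta(alpha, alpha).
    lam = rng.beta(alpha, alpha)
    num_replaced = int(round(lam * num_patches))

    # Pair every sample with another sample via a random permutation of the batch.
    partner = rng.permutation(batch)

    # Assumption: one random set of patch positions is shared by the whole batch.
    replaced = np.zeros(num_patches, dtype=bool)
    replaced[rng.choice(num_patches, size=num_replaced, replace=False)] = True

    # Take the partner's patch where the mask is set, the original patch elsewhere.
    mixed = np.where(replaced[None, :, None], patches[partner], patches)

    kept_fraction = 1.0 - num_replaced / num_patches
    return mixed, partner, replaced, kept_fraction


rng = np.random.default_rng(0)
batch_of_patches = rng.normal(size=(4, 10, 3))
mixed, partner, replaced, kept_fraction = patch_mix(batch_of_patches, rng)
print(mixed.shape, partner, int(replaced.sum()), kept_fraction)
```

In a setup like this, `partner` and `kept_fraction` would let a training loss weight the labels or representations of the two source samples; how the paper's contrastive objective uses the mixed representations is not specified in the abstract and is not modeled here.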

Code Implementations: 1 repo
