AI | Mar 8, 2024

Medical Speech Symptoms Classification via Disentangled Representation

arXiv:2403.05000v3 | h-index: 22 | CSCWD
Originality: Highly original
AI Analysis

This work addresses symptomatic diagnosis from medical speech for healthcare applications. It is an incremental improvement that applies a novel method to a known bottleneck.

The paper tackles the classification of medical speech symptoms by disentangling intent and content representations from textual-acoustic data, and reports an average accuracy of 95% in detecting 25 different symptoms.

Existing work defines intent for the purpose of understanding spoken language. Both the textual and the acoustic features of medical speech carry intent, which is important for symptomatic diagnosis. In this paper, we propose a medical speech classification model named DRSC that automatically learns to disentangle intent and content representations from textual-acoustic data for classification. Intent encoders extract the intent representations of the text domain and the Mel-spectrogram domain, and the reconstructed text feature and Mel-spectrogram feature are then obtained through two exchanges. The intents from the two domains are combined into a joint representation, and this integrated intent representation is fed into a decision layer for classification. Experimental results show that our model obtains an average accuracy rate of 95% in detecting 25 different medical symptoms.
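The data flow described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the layer sizes, the random linear layers with `tanh`, the separate content encoders and decoders, and the reading of "two exchanges" as a swap of intents across domains followed by a swap back are all assumptions made for illustration. Only the two domains, the joint intent representation, the decision layer, and the 25 classes come from the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; only NUM_SYMPTOMS = 25 comes from the abstract.
TEXT_DIM, MEL_DIM, INTENT_DIM, CONTENT_DIM, NUM_SYMPTOMS = 32, 40, 8, 16, 25


def linear(in_dim, out_dim):
    """Random weight matrix standing in for a trained layer (assumption)."""
    return rng.normal(scale=0.1, size=(in_dim, out_dim))


# One intent encoder and one content encoder per domain (assumed linear).
W_intent = {"text": linear(TEXT_DIM, INTENT_DIM), "mel": linear(MEL_DIM, INTENT_DIM)}
W_content = {"text": linear(TEXT_DIM, CONTENT_DIM), "mel": linear(MEL_DIM, CONTENT_DIM)}
# Decoders rebuild a domain feature from the concatenation [intent, content].
W_decode = {
    "text": linear(INTENT_DIM + CONTENT_DIM, TEXT_DIM),
    "mel": linear(INTENT_DIM + CONTENT_DIM, MEL_DIM),
}
# Decision layer over the joint intent of both domains.
W_decision = linear(2 * INTENT_DIM, NUM_SYMPTOMS)


def encode(x, domain):
    """Split a feature batch into (intent, content) representations."""
    return np.tanh(x @ W_intent[domain]), np.tanh(x @ W_content[domain])


def decode(intent, content, domain):
    """Rebuild a feature in `domain` from an intent and a content code."""
    return np.concatenate([intent, content], axis=-1) @ W_decode[domain]


def forward(text_feat, mel_feat):
    i_text, c_text = encode(text_feat, "text")
    i_mel, c_mel = encode(mel_feat, "mel")

    # First exchange (assumed): each domain is decoded with the other's intent.
    text_swapped = decode(i_mel, c_text, "text")
    mel_swapped = decode(i_text, c_mel, "mel")

    # Second exchange (assumed): re-encode and swap intents back.
    i_text2, c_text2 = encode(text_swapped, "text")
    i_mel2, c_mel2 = encode(mel_swapped, "mel")
    text_rec = decode(i_mel2, c_text2, "text")
    mel_rec = decode(i_text2, c_mel2, "mel")

    # Joint intent representation -> decision layer -> class probabilities.
    joint = np.concatenate([i_text, i_mel], axis=-1)
    logits = joint @ W_decision
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)
    return probs, text_rec, mel_rec
```

In a trained version of this sketch, `text_rec` and `mel_rec` would be compared against the original inputs with a reconstruction loss, and `probs` against the symptom labels with a classification loss. Neither the losses nor the training procedure are specified in the abstract, so they are left out here.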

