Seq2seq for Automatic Paraphasia Detection in Aphasic Speech
This work targets the time-consuming manual process by which clinicians identify paraphasias when assessing aphasia, and it improves incrementally on prior automated approaches.
The paper tackles automatic detection of paraphasias (speech errors) in aphasic speech by proposing a novel sequence-to-sequence model trained end-to-end for both speech recognition and paraphasia detection. The model outperforms the previous state-of-the-art approach at both the word level and the utterance level.
Paraphasias are speech errors that are often characteristic of aphasia, and they represent an important signal in assessing disease severity and subtype. Traditionally, clinicians manually identify paraphasias by transcribing and analyzing speech-language samples, which can be a time-consuming and burdensome process. Identifying paraphasias automatically can greatly help clinicians with the transcription process and ultimately facilitate more efficient and consistent aphasia assessment. Previous research has demonstrated the feasibility of automatic paraphasia detection by training an automatic speech recognition (ASR) model to extract transcripts and then training a separate paraphasia detection model on a set of hand-engineered features. In this paper, we propose a novel sequence-to-sequence (seq2seq) model that is trained end-to-end (E2E) to perform both ASR and paraphasia detection tasks. We show that the proposed model outperforms the previous state-of-the-art approach for both word-level and utterance-level paraphasia detection tasks, and we provide additional follow-up evaluations to further understand the proposed model's behavior.
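One way a single seq2seq model could carry both tasks is to have it emit one token sequence in which every transcribed word is followed by a tag marking it as correct or as a paraphasia. The Python sketch below illustrates that idea only; the abstract does not specify the output format, so the tag tokens `<c>` and `<p>`, the function names, and the rule that an utterance is positive when any word is flagged are all assumptions made for illustration, not the paper's method.

```python
from typing import List, Tuple

# Hypothetical tag tokens; the abstract does not specify a label scheme.
CORRECT = "<c>"
PARAPHASIA = "<p>"
TAGS = {CORRECT, PARAPHASIA}


def encode_target(words: List[str], is_paraphasia: List[bool]) -> List[str]:
    """Interleave each word with a tag token to form one target sequence."""
    if len(words) != len(is_paraphasia):
        raise ValueError("words and labels must have the same length")
    target: List[str] = []
    for word, flag in zip(words, is_paraphasia):
        target.append(word)
        target.append(PARAPHASIA if flag else CORRECT)
    return target


def decode_output(tokens: List[str]) -> List[Tuple[str, bool]]:
    """Recover (word, is_paraphasia) pairs from a decoded token sequence.

    Assumption: a word with no tag after it is treated as correct, and a
    tag with no word before it is ignored.
    """
    pairs: List[Tuple[str, bool]] = []
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token in TAGS:
            i += 1  # stray tag, nothing to attach it to
            continue
        next_token = tokens[i + 1] if i + 1 < len(tokens) else None
        if next_token in TAGS:
            pairs.append((token, next_token == PARAPHASIA))
            i += 2
        else:
            pairs.append((token, False))
            i += 1
    return pairs


def utterance_has_paraphasia(pairs: List[Tuple[str, bool]]) -> bool:
    """Assumed utterance-level rule: positive if any word is flagged."""
    return any(flag for _, flag in pairs)
```

Under these assumptions, word-level detection reads the tag attached to each word and utterance-level detection reduces those tags to a single decision, so both evaluations can be scored from the same decoded sequence.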