Evaluating Word Embeddings for Sentence Boundary Detection in Speech Transcripts
This paper addresses sentence segmentation for automated neuropsychological testing of patients with potential cognitive impairment, but it appears incremental because it focuses on comparing existing embedding methods.
The paper tackles sentence boundary detection in speech transcripts for neuropsychological discourse analysis. It evaluates word embedding methods designed to capture semantic, syntactic, or morphological similarities to determine which works best. No concrete results or numbers are provided.
This paper is motivated by the automation of neuropsychological tests that involve discourse analysis of narrative retellings by patients with potential cognitive impairment. In this scenario, sentence boundary detection in speech transcripts is important because discourse analysis applies Natural Language Processing tools, such as taggers and parsers, that depend on the sentence as a processing unit. Our aim in this paper is to verify which embedding induction method works best for the sentence boundary detection task, specifically whether it is one proposed to capture semantic, syntactic, or morphological similarities.
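To make the task concrete, sentence boundary detection on an unpunctuated transcript is often framed as labeling each word with whether a sentence ends after it. The sketch below illustrates that common framing only; it is not necessarily the formulation used in the paper. The function names `label_boundaries` and `segment` and the toy sentences are hypothetical, chosen for illustration.

```python
from typing import List, Tuple


def label_boundaries(sentences: List[List[str]]) -> Tuple[List[str], List[int]]:
    """Flatten reference sentences into one token stream with boundary labels.

    labels[i] is 1 if a sentence ends right after tokens[i], otherwise 0.
    (Assumed framing: per-word binary classification.)
    """
    tokens: List[str] = []
    labels: List[int] = []
    for sentence in sentences:
        for i, word in enumerate(sentence):
            tokens.append(word)
            labels.append(1 if i == len(sentence) - 1 else 0)
    return tokens, labels


def segment(tokens: List[str], labels: List[int]) -> List[List[str]]:
    """Rebuild sentences from a token stream and predicted boundary labels."""
    sentences: List[List[str]] = []
    current: List[str] = []
    for word, is_boundary in zip(tokens, labels):
        current.append(word)
        if is_boundary:
            sentences.append(current)
            current = []
    if current:  # trailing words with no closing boundary
        sentences.append(current)
    return sentences


# Toy example (invented data, not taken from the paper).
reference = [["the", "girl", "went", "home"], ["she", "was", "tired"]]
tokens, labels = label_boundaries(reference)
restored = segment(tokens, labels)
```

Under this framing, a classifier would take a representation of each word, for example its embedding, and predict the corresponding label. Downstream taggers and parsers would then receive the sentences produced by `segment`.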