On the Role of Style in Parsing Speech with Neural Models
This paper addresses the challenge of parsing conversational speech for natural language processing applications, though its contribution is incremental, since it builds on existing neural methods.
The paper tackles the problem of parsing spontaneous speech, on which earlier parsers trained on written text performed poorly. It shows that neural approaches can use written text to improve parsing of spontaneous speech, with prosody further boosting performance by 2.1% in F1 score.
The differences between written text and conversational speech are substantial; previous parsers trained on treebanked text have given very poor results on spontaneous speech. For spoken language, the mismatch in style also extends to prosodic cues, though this is less well understood. This paper re-examines the use of written text in parsing speech in the context of recent advances in neural language processing. We show that neural approaches facilitate using written text to improve parsing of spontaneous speech, and that prosody further improves over this state-of-the-art result. Further, we find an asymmetric degradation from read vs. spontaneous mismatch, with spontaneous speech more generally useful for training parsers.