CLApr 12, 2016

Disfluency Detection using a Bidirectional LSTM

arXiv:1604.03209v1129 citations
Originality Incremental advance
AI Analysis

This work addresses the problem of accurately detecting disfluencies in speech for natural language processing applications, representing an incremental improvement over existing methods.

The paper tackles disfluency detection in speech by proposing a Bidirectional LSTM model with pattern match features and integer linear programming, achieving state-of-the-art performance on the Switchboard corpus for both disfluency and correction detection tasks, with improved detection of non-repetition disfluencies.

We introduce a new approach for disfluency detection using a Bidirectional Long-Short Term Memory neural network (BLSTM). In addition to the word sequence, the model takes as input pattern match features that were developed to reduce sensitivity to vocabulary size in training, which lead to improved performance over the word sequence alone. The BLSTM takes advantage of explicit repair states in addition to the standard reparandum states. The final output leverages integer linear programming to incorporate constraints of disfluency structure. In experiments on the Switchboard corpus, the model achieves state-of-the-art performance for both the standard disfluency detection task and the correction detection task. Analysis shows that the model has better detection of non-repetition disfluencies, which tend to be much harder to detect.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes