Disfluency Detection using a Noisy Channel Model and a Deep Neural Language Model
This work addresses disfluency detection for speech processing applications and offers an incremental improvement over existing methods.
The paper tackles disfluency detection in spontaneous speech transcripts by proposing an LSTM Noisy Channel Model, which uses a noisy channel model to generate candidate analyses and an LSTM language model to rerank them, improving on the state of the art.
This paper presents a model for disfluency detection in spontaneous speech transcripts called the LSTM Noisy Channel Model. The model uses a Noisy Channel Model (NCM) to generate n-best candidate disfluency analyses and a Long Short-Term Memory (LSTM) language model to score the underlying fluent sentence of each analysis. The LSTM language model scores, along with other features, are used in a MaxEnt reranker to identify the most plausible analysis. We show that using an LSTM language model in the reranking process of a noisy channel disfluency model improves the state of the art in disfluency detection.
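
The reranking pipeline described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Analysis` class, the function names, the candidate scores, and the hand-set weights are all hypothetical, and a toy bigram table stands in for the LSTM language model. In the paper the reranker weights are learned by a MaxEnt model over more features than the two shown here.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Analysis:
    """One candidate disfluency analysis (hypothetical structure)."""
    tokens: List[str]
    disfluent: List[bool]  # True marks a token tagged as disfluent
    ncm_logprob: float     # assumed score assigned by the noisy channel model


def fluent_tokens(analysis: Analysis) -> List[str]:
    """Return the underlying fluent sentence: tokens not tagged as disfluent."""
    return [t for t, d in zip(analysis.tokens, analysis.disfluent) if not d]


def toy_lm_logprob(tokens: List[str],
                   bigrams: Dict[Tuple[str, str], float],
                   unk: float = -5.0) -> float:
    """Stand-in for the LSTM language model: sum of bigram log-probabilities."""
    padded = ["<s>"] + tokens + ["</s>"]
    return sum(bigrams.get(pair, unk) for pair in zip(padded, padded[1:]))


def rerank(candidates: List[Analysis],
           bigrams: Dict[Tuple[str, str], float],
           weights: Dict[str, float]) -> Analysis:
    """Pick the candidate with the highest weighted (log-linear) feature sum."""
    def score(analysis: Analysis) -> float:
        features = {
            "ncm": analysis.ncm_logprob,
            "lm": toy_lm_logprob(fluent_tokens(analysis), bigrams),
        }
        return sum(weights[name] * value for name, value in features.items())
    return max(candidates, key=score)


# Illustrative data only: every number below is made up.
TOKENS = ["to", "Boston", "uh", "to", "Denver"]
BIGRAMS = {
    ("<s>", "to"): -1.0,
    ("to", "Denver"): -1.0,
    ("Denver", "</s>"): -1.0,
    ("to", "Boston"): -1.0,
    ("Boston", "to"): -4.0,
}
CANDIDATES = [
    # Tags "to Boston uh" as disfluent; fluent sentence is "to Denver".
    Analysis(TOKENS, [True, True, True, False, False], ncm_logprob=-2.5),
    # Tags only "uh" as disfluent; fluent sentence is "to Boston to Denver".
    Analysis(TOKENS, [False, False, True, False, False], ncm_logprob=-2.0),
]
WEIGHTS = {"ncm": 1.0, "lm": 1.0}  # hand-set here; learned in a real reranker

best = rerank(CANDIDATES, BIGRAMS, WEIGHTS)
print(" ".join(fluent_tokens(best)))
```

In this toy setup the noisy channel score alone slightly prefers the second candidate, but the language model penalizes its less fluent underlying sentence, so the combined score selects the first. That is the intuition behind scoring the fluent sentence of each analysis rather than the raw transcript.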