CL · LG · Dec 15, 2021

Mask-combine Decoding and Classification Approach for Punctuation Prediction with real-time Inference Constraints

arXiv:2112.08098v2
AI Analysis

This work addresses punctuation prediction for real-time applications such as speech recognition. Its contribution is incremental: it builds on existing decoding strategies and focuses on optimising them.

The authors tackle punctuation prediction under real-time inference constraints by unifying existing decoding strategies in one framework and introducing a new strategy that combines multiple predictions for each word across windows. Optimising these strategies after training yields significant improvements without retraining, at the cost of a potential increase in inference time. They also compare tagging and classification approaches and find that classification can be beneficial when little or no right-side context is available.

In this work, we unify several existing decoding strategies for punctuation prediction in one framework and introduce a novel strategy which utilises multiple predictions at each word across different windows. We show that significant improvements can be achieved by optimising these strategies after training a model, only leading to a potential increase in inference time, with no requirement for retraining. We further use our decoding strategy framework for the first comparison of tagging and classification approaches for punctuation prediction in a real-time setting. Our results show that a classification approach for punctuation prediction can be beneficial when little or no right-side context is available.
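The abstract does not spell out how predictions from different windows are combined, so the following is only a minimal sketch of the general idea, under stated assumptions: overlapping windows slide over the word sequence, predictions near window edges are masked out, and the remaining per-word probabilities are averaged. The names `mask_combine_decode`, `toy_predict`, `LABELS`, and the parameters `window`, `stride`, `mask_left`, and `mask_right` are hypothetical and are not taken from the paper.

```python
from typing import Callable, List, Sequence

# Hypothetical label set: no punctuation, comma, full stop, question mark.
LABELS = ["", ",", ".", "?"]

# A predictor maps a window of words to one probability vector per word.
Predictor = Callable[[Sequence[str]], List[List[float]]]


def mask_combine_decode(
    words: Sequence[str],
    predict: Predictor,
    window: int = 4,
    stride: int = 2,
    mask_left: int = 0,
    mask_right: int = 1,
) -> List[str]:
    """Average unmasked per-word predictions over overlapping windows.

    Assumption: masking drops the first `mask_left` and last `mask_right`
    positions of a window, except at the true start and end of the input.
    """
    if stride < 1 or stride > window - mask_left - mask_right:
        raise ValueError("stride must leave every word unmasked in some window")

    n = len(words)
    sums = [[0.0] * len(LABELS) for _ in range(n)]
    counts = [0] * n

    start = 0
    while True:
        end = min(start + window, n)
        probs = predict(words[start:end])
        for offset, p in enumerate(probs):
            # Edge positions are only masked when another window can cover them.
            masked_left = offset < mask_left and start != 0
            masked_right = offset >= (end - start) - mask_right and end != n
            if masked_left or masked_right:
                continue
            i = start + offset
            for k, value in enumerate(p):
                sums[i][k] += value
            counts[i] += 1
        if end == n:
            break
        start += stride

    decoded = []
    for total, count in zip(sums, counts):
        mean = [v / count for v in total]
        decoded.append(LABELS[mean.index(max(mean))])
    return decoded


def toy_predict(chunk: Sequence[str]) -> List[List[float]]:
    """Stand-in for a trained model; ignores context entirely."""
    table = {"hello": [0.1, 0.8, 0.05, 0.05], "you": [0.1, 0.1, 0.1, 0.7]}
    return [table.get(w, [0.9, 0.05, 0.03, 0.02]) for w in chunk]


if __name__ == "__main__":
    sentence = ["hello", "how", "are", "you"]
    marks = mask_combine_decode(sentence, toy_predict, window=3, stride=1)
    print(" ".join(w + m for w, m in zip(sentence, marks)))
```

In this sketch a smaller `stride` means more windows per word and therefore more model calls, which mirrors the abstract's point that such strategies can be tuned after training at the cost of inference time. Setting `mask_right` to 0 corresponds to using predictions that have no right-side context.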

