Punctuation Prediction in Spontaneous Conversations: Can We Mitigate ASR Errors with Retrofitted Word Embeddings?
This paper addresses punctuation restoration for speech recognition systems, but its contribution is incremental: it builds on existing methods with domain-specific adaptations.
The paper addresses punctuation prediction in spontaneous conversations, where ASR errors, particularly homonyms, confuse the prediction model. It retrofits word embeddings so that homonym embeddings are better aligned, and reports absolute accuracy improvements of 6.2% for question marks and 9% for periods over the state-of-the-art model.
Automatic Speech Recognition (ASR) systems introduce word errors, which often confuse punctuation prediction models, turning punctuation restoration into a challenging task. These errors usually take the form of homonyms. We show how retrofitting the word embeddings on domain-specific data can mitigate ASR errors. Our main contribution is a method for better alignment of homonym embeddings and the validation of the presented method on the punctuation prediction task. We record an absolute improvement in punctuation prediction accuracy of between 6.2% (for question marks) and 9% (for periods) when compared with the state-of-the-art model.
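To make the idea of aligning homonym embeddings concrete, the following is a minimal sketch of one common retrofitting scheme, in which each vector is repeatedly pulled toward the vectors of its linked words while staying close to its original value. The abstract does not specify the exact retrofitting procedure, so the update rule, the function names `retrofit` and `cosine`, the toy vectors, and the "their"/"there" pair are all assumptions made for illustration, not the authors' method.

```python
import math


def retrofit(embeddings, neighbours, iterations=10, alpha=1.0):
    """Pull each word vector toward its linked words (hypothetical helper).

    embeddings: dict mapping word -> list of floats (the original vectors)
    neighbours: dict mapping word -> list of linked words (here: homonyms)
    alpha:      weight that keeps a vector close to its original value
    """
    original = {w: list(v) for w, v in embeddings.items()}
    current = {w: list(v) for w, v in embeddings.items()}
    for _ in range(iterations):
        for word, linked in neighbours.items():
            linked = [n for n in linked if n in current]
            if word not in current or not linked:
                continue
            # Each neighbour gets weight 1/degree, so neighbour weights sum to 1.
            beta = 1.0 / len(linked)
            updated = []
            for d in range(len(current[word])):
                pull = sum(beta * current[n][d] for n in linked)
                updated.append((alpha * original[word][d] + pull) / (alpha + 1.0))
            current[word] = updated
    return current


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


# Toy data (assumed): two homonyms that start far apart, plus an unrelated word.
vectors = {"their": [1.0, 0.0], "there": [0.0, 1.0], "cat": [0.5, 0.5]}
homonyms = {"their": ["there"], "there": ["their"]}

fitted = retrofit(vectors, homonyms)
before = cosine(vectors["their"], vectors["there"])
after = cosine(fitted["their"], fitted["there"])
print(f"similarity before: {before:.2f}, after: {after:.2f}")
```

In this toy setting the two homonym vectors move closer together while words without links are left unchanged, which is the property a downstream punctuation model could exploit when the ASR output substitutes one homonym for another.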