CL AS · Feb 12, 2021

Neural Inverse Text Normalization

Monica Sunkara, Chaitanya Shivade, Sravan Bodapati, Katrin Kirchhoff

arXiv:2102.06380v1 · 3.0 · 33 citations

Originality: Incremental advance

AI Analysis

The paper addresses the scalability and language-adaptability problems of ITN for ASR systems, though the contribution is incremental because it builds on existing FST and neural techniques.

The paper tackles inverse text normalization (ITN) with a neural approach based on transformer seq2seq models, plus a hybrid framework that pairs the neural model with an FST. It reports fewer incorrect perturbations to ASR output and a lower WER than the baselines on English, Spanish, German, and Italian datasets.
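For readers unfamiliar with the metric, WER is the number of word-level substitutions, deletions, and insertions needed to turn the system output into the reference, divided by the number of reference words. The sketch below is a generic dynamic-programming implementation for illustration only; the function names `edit_counts` and `wer` are made up here, and nothing in it is taken from the paper's evaluation code.

```python
def edit_counts(reference, hypothesis):
    """Return (substitutions, deletions, insertions) for a minimal
    word-level alignment of hypothesis against reference."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = (total_cost, subs, dels, ins) for ref[:i] vs hyp[:j]
    dp = [[None] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    dp[0][0] = (0, 0, 0, 0)
    for i in range(1, len(ref) + 1):
        dp[i][0] = (i, 0, i, 0)  # delete every reference word
    for j in range(1, len(hyp) + 1):
        dp[0][j] = (j, 0, 0, j)  # insert every hypothesis word
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            c, s, d, n = dp[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diag = (c, s, d, n)          # words match, no cost
            else:
                diag = (c + 1, s + 1, d, n)  # substitution
            c, s, d, n = dp[i - 1][j]
            delete = (c + 1, s, d + 1, n)    # reference word missing
            c, s, d, n = dp[i][j - 1]
            insert = (c + 1, s, d, n + 1)    # extra hypothesis word
            # Tuples compare on total cost first, so min() picks a cheapest path.
            dp[i][j] = min(diag, delete, insert)
    return dp[-1][-1][1:]


def wer(reference, hypothesis):
    """Word error rate: (S + D + I) / number of reference words."""
    n_ref = len(reference.split())
    if n_ref == 0:
        raise ValueError("reference must contain at least one word")
    return sum(edit_counts(reference, hypothesis)) / n_ref


# Illustrative example: the written form "3:30" was rendered as two tokens.
print(edit_counts("the meeting is at 3:30 pm", "the meeting is at 3 30 pm"))
print(wer("the meeting is at 3:30 pm", "the meeting is at 3 30 pm"))
```

In the example, one reference token is replaced and one extra token appears, so two edits are counted against six reference words.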

While there have been several contributions exploring state-of-the-art techniques for text normalization, the problem of inverse text normalization (ITN) remains relatively unexplored. The best-known approaches leverage finite state transducer (FST) based models, which rely on manually curated rules and are hence not scalable. We propose an efficient and robust neural solution for ITN leveraging transformer-based seq2seq models and FST-based text normalization techniques for data preparation. We show that this can be easily extended to other languages without the need for a linguistic expert to manually curate rules. We then present a hybrid framework for integrating Neural ITN with an FST to overcome common recoverable errors in production environments. Our empirical evaluations show that the proposed solution minimizes incorrect perturbations (insertions, deletions and substitutions) to ASR output and maintains high quality even on out-of-domain data. A transformer-based model infused with pretraining consistently achieves a lower WER across several datasets and is able to outperform baselines on English, Spanish, German and Italian datasets.
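The data-preparation idea in the abstract can be sketched as follows: run a text normalization (TN) system over written text to obtain its spoken form, then use the (spoken, written) pairs as training data for a seq2seq ITN model. The code below is a toy illustration under heavy assumptions. The hand-written `normalize` function stands in for an FST-based TN system and only verbalizes standalone integers from 0 to 99; the names `verbalize_number`, `normalize`, and `make_itn_pairs` are hypothetical and do not come from the paper.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def verbalize_number(n):
    """Spell out an integer in the range 0..99 (toy coverage only)."""
    if not 0 <= n <= 99:
        raise ValueError("this sketch only covers 0..99")
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]} {ONES[ones]}"


def normalize(written):
    """Stand-in for a TN system: written form -> spoken form.
    Assumption: only standalone one- or two-digit numbers are rewritten."""
    return re.sub(r"\b\d{1,2}\b",
                  lambda m: verbalize_number(int(m.group())),
                  written)


def make_itn_pairs(written_sentences):
    """Build (source, target) pairs for ITN training:
    source is the spoken form, target is the original written form."""
    return [(normalize(s), s) for s in written_sentences]


for spoken, written in make_itn_pairs(["I have 21 apples", "Room 40 is free"]):
    print(f"{spoken!r} -> {written!r}")
```

The point of the design is that the TN direction (written to spoken) supplies the model's input side, so written text alone is enough to produce parallel training data.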
