CL, Dec 19, 2017

Subword and Crossword Units for CTC Acoustic Models

arXiv:1712.06855v2, 33 citations
Originality: Incremental advance
AI Analysis

This paper addresses the trade-off between unit set size and available training data in CTC-based speech recognition, but the advance is incremental: it builds on existing CTC and language model methods.

The paper tackles the problem of selecting a unit set for CTC-based speech recognition by using Byte Pair Encoding to learn a unit set of arbitrary size from the training text. Combined with decoding that uses a separate language model, this achieves state-of-the-art results for grapheme-based CTC systems.

This paper proposes a novel approach to creating a unit set for CTC-based speech recognition systems. Using Byte Pair Encoding, we learn a unit set of arbitrary size from a given training text. In contrast to using characters or words as units, this allows us to find a good trade-off between the size of the unit set and the available training data. We evaluate both crossword units, which may span multiple words, and subword units. By combining this approach with decoding methods that use a separate language model, we achieve state-of-the-art results for grapheme-based CTC systems.
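
The unit-learning step described in the abstract can be illustrated with a minimal Byte Pair Encoding sketch. This is not the authors' implementation. The function names (`learn_bpe`, `merge_pair`), the `_` space marker, and the tie-breaking rule are assumptions made for illustration. The abstract does not say how word boundaries are represented, so the crossword variant here simply treats the space as an ordinary symbol that merges may absorb, while the subword variant never merges across a word boundary.

```python
from collections import Counter


def merge_pair(seq, pair):
    """Replace every non-overlapping occurrence of `pair` in `seq` with the joined symbol."""
    out = []
    i = 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(seq[i] + seq[i + 1])
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return tuple(out)


def learn_bpe(text, num_merges, crossword=False, space="_"):
    """Learn up to `num_merges` BPE merges from `text`; return (merges, units).

    crossword=False: each word is its own sequence, so units stay inside words.
    crossword=True:  the whole text is one sequence and the space marker is a
                     normal symbol, so units may span several words
                     (an assumption about how such units could be built).
    """
    words = text.split()
    if crossword:
        sequences = Counter({tuple(space.join(words)): 1})
    else:
        sequences = Counter(tuple(w) for w in words)

    # The initial unit set is the set of single characters (graphemes).
    units = {sym for seq in sequences for sym in seq}
    merges = []

    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by sequence frequency.
        pairs = Counter()
        for seq, freq in sequences.items():
            for a, b in zip(seq, seq[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Most frequent pair; ties are broken lexicographically (arbitrary choice).
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        units.add(best[0] + best[1])

        merged = Counter()
        for seq, freq in sequences.items():
            merged[merge_pair(seq, best)] += freq
        sequences = merged

    return merges, units


if __name__ == "__main__":
    corpus = "the cat the hat the bat"
    print(learn_bpe(corpus, 3))
    print(learn_bpe(corpus, 3, crossword=True))
```

The number of merges controls the size of the unit set: zero merges leaves a pure character set, and more merges move the units toward whole words (or, in the crossword variant, word sequences). This is the trade-off between unit set size and training data that the abstract refers to.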

