CL AI · Oct 20, 2016

Jointly Learning to Align and Convert Graphemes to Phonemes with Neural Attention Models

arXiv:1610.06540v17.343 citations · Has Code

Originality: Incremental advance

AI Analysis

This work addresses a key problem in speech and language processing, with applications such as text-to-speech. Its contribution is incremental, since it builds on existing attention mechanisms.

The authors tackle grapheme-to-phoneme conversion with an attention-enabled encoder-decoder model that jointly learns alignments and conversions, achieving state-of-the-art results on three standard datasets (CMUDict, Pronlex, and NetTalk).

We propose an attention-enabled encoder-decoder model for the problem of grapheme-to-phoneme conversion. Most previous work has tackled the problem via joint sequence models that require explicit alignments for training. In contrast, the attention-enabled encoder-decoder model allows for jointly learning to align and convert characters to phonemes. We explore different types of attention models, including global and local attention, and our best models achieve state-of-the-art results on three standard data sets (CMUDict, Pronlex, and NetTalk).
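
The difference between global and local attention mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the dot-product scoring, the fixed window center, the `half_width` parameter, and the toy vectors standing in for encoder and decoder states are all assumptions chosen for illustration. In a real model the states would come from trained recurrent networks.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def global_attention(decoder_state, encoder_states):
    """Attend over every encoder state (one per input grapheme).

    Scoring by dot product is an assumption; the abstract does not
    say which scoring function the paper uses.
    """
    scores = [dot(decoder_state, h) for h in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [
        sum(w * h[i] for w, h in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


def local_attention(decoder_state, encoder_states, center, half_width=1):
    """Attend only to a window of positions around `center`.

    `center` and `half_width` are hypothetical parameters; positions
    outside the window receive zero weight.
    """
    lo = max(0, center - half_width)
    hi = min(len(encoder_states), center + half_width + 1)
    window_weights, context = global_attention(decoder_state, encoder_states[lo:hi])
    weights = [0.0] * len(encoder_states)
    weights[lo:hi] = window_weights
    return weights, context


# Toy example: four grapheme positions, two-dimensional states.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
decoder_state = [0.0, 2.0]

g_weights, g_context = global_attention(decoder_state, encoder_states)
l_weights, l_context = local_attention(decoder_state, encoder_states, center=1)
```

In both cases the weights form a soft alignment between the phoneme being emitted and the input graphemes, which is what lets such a model learn alignment and conversion together rather than requiring explicit alignments up front.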
