CL · AI · Jun 18, 2018

Unsupervised Word Segmentation from Speech with Attention

arXiv:1806.06734v1 · 29 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of language documentation for unwritten languages, though it is an incremental approach that builds on existing machine translation and acoustic unit discovery techniques.

The paper tackles the problem of automatically identifying lexical units in low-resource unwritten languages. It performs attentional word segmentation directly from the speech signal, using a method that pairs recordings with translations in a well-resourced language. The approach is evaluated against monolingual and bilingual baselines on Mboshi, a Bantu language.

We present a first attempt to perform attentional word segmentation directly from the speech signal, with the final goal to automatically identify lexical units in a low-resource, unwritten language (UL). Our methodology assumes a pairing between recordings in the UL with translations in a well-resourced language. It uses Acoustic Unit Discovery (AUD) to convert speech into a sequence of pseudo-phones that is segmented using neural soft-alignments produced by a neural machine translation model. Evaluation uses an actual Bantu UL, Mboshi; comparisons to monolingual and bilingual baselines illustrate the potential of attentional word segmentation for language documentation.
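
As a rough illustration of the segmentation step described in the abstract, the sketch below turns a soft-alignment matrix into word boundaries. It assumes one common heuristic: each pseudo-phone is hard-aligned to the translation word that gives it the highest attention weight, and a boundary is placed wherever that aligned word changes. The function name `segment_by_attention`, the unit labels, and the toy attention values are hypothetical and are not taken from the paper, whose exact procedure may differ.

```python
from typing import List, Sequence


def segment_by_attention(
    units: Sequence[str],
    attention: Sequence[Sequence[float]],
) -> List[List[str]]:
    """Split a pseudo-phone sequence into word-like segments.

    attention[j][i] is the (assumed) attention weight that target word j
    places on source unit i, as produced by some translation model.
    """
    if not units:
        return []
    if not attention or any(len(row) != len(units) for row in attention):
        raise ValueError("attention must have one weight per unit in every row")

    n_targets = len(attention)
    # Hard-align each unit to the target word with the largest weight.
    aligned = [
        max(range(n_targets), key=lambda j: attention[j][i])
        for i in range(len(units))
    ]

    # Start a new segment whenever the aligned target word changes.
    segments = [[units[0]]]
    for i in range(1, len(units)):
        if aligned[i] != aligned[i - 1]:
            segments.append([units[i]])
        else:
            segments[-1].append(units[i])
    return segments


# Toy example: five pseudo-phones, two translation words (made-up weights).
units = ["a1", "a2", "a3", "a4", "a5"]
attention = [
    [0.9, 0.8, 0.3, 0.2, 0.1],  # weights from target word 0
    [0.1, 0.2, 0.7, 0.8, 0.9],  # weights from target word 1
]
segments = segment_by_attention(units, attention)
print(segments)
```

In this toy case the first two units align to the first translation word and the remaining three to the second, so a single boundary falls between `a2` and `a3`.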

