CL · Jul 17, 2018

Low-Resource Contextual Topic Identification on Speech

arXiv:1807.06204v2 · 2 citations
AI Analysis

This work addresses topic identification on speech in low-resource languages. It appears incremental, since it builds on existing methods and adds contextual enhancements.

The paper tackles topic identification on unstructured audio in low-resource languages. It proposes a cascade method and an attention-based contextual model that leverages dependencies across segments, and the contextual model significantly outperforms context-independent models on three of the four tested languages.

In topic identification (topic ID) on real-world unstructured audio, an audio instance with variable topic shifts is first broken into sequential segments, and each segment is independently classified. We first present a general-purpose method for topic ID on spoken segments in low-resource languages, using a cascade of universal acoustic modeling, translation lexicons to English, and English-language topic classification. Next, instead of classifying each segment independently, we demonstrate that exploiting the contextual dependencies across sequential segments can provide large improvements. In particular, we propose an attention-based contextual model which is able to leverage the contexts in a selective manner. We test both our contextual and non-contextual models on four LORELEI languages, and on all but one our attention-based contextual model significantly outperforms the context-independent models.
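To make the idea of leveraging neighbouring segments "in a selective manner" concrete, here is a minimal sketch of attention over a window of segment feature vectors. The abstract does not specify the paper's architecture, so this uses plain scaled dot-product attention as a stand-in. The function names, the `window` parameter, and the choice to concatenate the segment's own features with the attended context are all illustrative assumptions, not the authors' model.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attend_over_context(segments, index, window=2):
    """Attend from segment `index` over segments within `window` positions.

    Assumption: scaled dot-product attention, with the current segment
    included among its own neighbours. Returns (context_vector, weights).
    """
    query = segments[index]
    lo = max(0, index - window)
    hi = min(len(segments), index + window + 1)
    neighbours = segments[lo:hi]
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, n) / scale for n in neighbours])
    # Weighted sum of neighbour vectors, one dimension at a time.
    context = [
        sum(w * n[d] for w, n in zip(weights, neighbours))
        for d in range(len(query))
    ]
    return context, weights


def contextual_features(segments, index, window=2):
    """Hypothetical classifier input: own features plus attended context."""
    context, _ = attend_over_context(segments, index, window)
    return list(segments[index]) + context


if __name__ == "__main__":
    # Toy per-segment features, e.g. bag-of-words vectors after translation.
    segs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
    _, weights = attend_over_context(segs, 0)
    print(weights)
```

In this sketch, segments that resemble the current one receive larger weights, so a dissimilar neighbour (for example, one after a topic shift) contributes less to the context vector than it would under uniform averaging.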
