AS · CL · SD · Jan 19, 2024

Multilingual acoustic word embeddings for zero-resource languages

arXiv:2401.10543v2 · 2 citations
Originality: Incremental advance
AI Analysis

This work addresses the problem of enabling speech technology for languages lacking labelled data, with incremental improvements in acoustic word embedding methods.

The research targets speech applications for zero-resource languages using multilingual acoustic word embeddings. It introduces a neural network that outperforms existing models and applies it to keyword spotting for hate speech detection in Swahili radio broadcasts.

This research addresses the challenge of developing speech applications for zero-resource languages that lack labelled data. It specifically uses acoustic word embeddings (AWEs) -- fixed-dimensional representations of variable-duration speech segments -- and relies on multilingual transfer, where labelled data from several well-resourced languages are used for pretraining. The study introduces a new neural network that outperforms existing AWE models on zero-resource languages, and it explores how the choice of well-resourced languages affects performance. The AWEs are applied in a keyword-spotting system for hate speech detection in Swahili radio broadcasts, demonstrating robustness in real-world scenarios. Additionally, novel semantic AWE models improve semantic query-by-example search.
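To make the idea of an AWE concrete, the sketch below maps speech segments of different durations to vectors of the same size and compares them with cosine similarity, which is the basic operation behind embedding-based keyword spotting. It is only an illustration under stated assumptions: it uses a simple frame-downsampling embedding rather than the neural network introduced in the paper, the feature frames are toy values, and the names `downsample_embed`, `cosine_similarity`, and `keyword_hits` are hypothetical.

```python
import math
from typing import List, Sequence

Frame = Sequence[float]  # one acoustic feature vector (e.g. one MFCC frame)


def downsample_embed(segment: Sequence[Frame], n_samples: int = 4) -> List[float]:
    """Map a variable-length segment to a fixed-dimensional vector.

    Assumption: a plain downsampling embedding, NOT the paper's neural model.
    Picks n_samples frames at evenly spaced positions and concatenates them,
    so the output length is n_samples * feature_dim for any segment duration.
    """
    if not segment:
        raise ValueError("segment must contain at least one frame")
    last = len(segment) - 1
    embedding: List[float] = []
    for i in range(n_samples):
        # Evenly spaced index in [0, last]; a one-frame segment repeats frame 0.
        idx = round(i * last / (n_samples - 1)) if n_samples > 1 else 0
        embedding.extend(segment[idx])
    return embedding


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def keyword_hits(query: Sequence[Frame],
                 candidates: Sequence[Sequence[Frame]],
                 threshold: float = 0.9) -> List[int]:
    """Return indices of candidate segments whose embedding is close to the query's."""
    q = downsample_embed(query)
    return [
        i for i, seg in enumerate(candidates)
        if cosine_similarity(q, downsample_embed(seg)) >= threshold
    ]


# Toy data: the same "word" spoken at two speeds, plus a different pattern.
query = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]   # 4 frames
slow = [[1.0, 0.0]] * 4 + [[0.0, 1.0]] * 4                 # 8 frames
different = [[0.0, 1.0]] * 3 + [[1.0, 0.0]] * 3            # 6 frames

hits = keyword_hits(query, [slow, different])
```

Because every segment is reduced to the same number of dimensions, comparing a query against many candidate segments needs only one vector comparison per candidate instead of a frame-by-frame alignment. A learned multilingual model would replace `downsample_embed` while the comparison step stays the same.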

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
