CL SD AS Jul 3, 2023

Semantic enrichment towards efficient speech representations

arXiv:2307.01323v1, 2 citations, h-index: 31
Originality: Synthesis-oriented
AI Analysis

This work addresses efficient speech representations for SLU, but it is incremental: it builds on the existing SAMU-XLSR model by specializing it on domain-specific data.

This study tackled the problem of improving semantic extraction for Spoken Language Understanding (SLU) tasks by specializing the SAMU-XLSR model on a small amount of transcribed data, aiming for better performance while considering computation costs.

Over the past few years, self-supervised learned speech representations have emerged as fruitful replacements for conventional surface representations when solving Spoken Language Understanding (SLU) tasks. Simultaneously, multilingual models trained on massive textual data were introduced to encode language-agnostic semantics. Recently, the SAMU-XLSR approach introduced a way to take advantage of such textual models to enrich multilingual speech representations with language-agnostic semantics. Aiming for better semantic extraction on a challenging SLU task while taking computation costs into account, this study investigates a specific in-domain semantic enrichment of the SAMU-XLSR model by specializing it on a small amount of transcribed data from the downstream task. In addition, we show the benefits of using same-domain French and Italian benchmarks for low-resource language portability and explore the cross-domain capacities of the enriched SAMU-XLSR.

