CL, SD, AS · Jan 11, 2020

Improving Spoken Language Understanding By Exploiting ASR N-best Hypotheses

arXiv:2001.05284v1 · 22 citations
Originality: Incremental advance
AI Analysis

This work addresses degraded SLU performance in applications that rely on speech recognition. It is an incremental advance because it builds on existing methods for handling ASR outputs.

The paper tackles the problem of ASR errors degrading spoken language understanding performance by proposing models that exploit n-best ASR hypotheses, resulting in improved semantic understanding.

In a modern spoken language understanding (SLU) system, the natural language understanding (NLU) module takes as input the interpretations of a spoken utterance produced by the automatic speech recognition (ASR) module. The NLU module usually uses only the first-best interpretation of an utterance in downstream tasks such as domain and intent classification. However, the ASR module may misrecognize some utterances, so the first-best interpretation can be erroneous and noisy. Relying solely on the first-best interpretation can therefore make the performance of downstream tasks suboptimal. To address this issue, we introduce a series of simple yet efficient models that improve the understanding of the semantics of the input utterances by collectively exploiting the n-best interpretations from the ASR module.
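The contrast between first-best and n-best input can be illustrated with a minimal sketch. It is not the paper's model: the function names, the `<sep>` separator token, the bag-of-words features, and the example hypotheses are all assumptions made here for illustration. It only shows how joining several hypotheses can surface a word that the first-best hypothesis misses.

```python
from collections import Counter


def first_best(hypotheses):
    """Return only the top-ranked ASR hypothesis (the usual NLU input)."""
    return hypotheses[0]


def combine_nbest(hypotheses, n=5, sep=" <sep> "):
    """Join the top-n hypotheses into one string (hypothetical scheme)."""
    return sep.join(hypotheses[:n])


def bag_of_words(text, sep_token="<sep>"):
    """Count tokens, ignoring the separator token."""
    return Counter(tok for tok in text.split() if tok != sep_token)


# Invented n-best list, ordered from most to least likely.
nbest = [
    "play the beetles",
    "play the beatles",
    "play the beadles",
]

one_best_features = bag_of_words(first_best(nbest))
nbest_features = bag_of_words(combine_nbest(nbest))
```

In this toy case the word "beatles" is absent from `one_best_features` but present in `nbest_features`, so a downstream intent or domain classifier would at least see the evidence that the first-best hypothesis dropped.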
