CL, LG
May 24, 2020

Jointly Encoding Word Confusion Network and Dialogue Context with BERT for Spoken Language Understanding

arXiv:2005.11640v3
19 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of ASR errors in SLU for dialogue systems, representing an incremental improvement by integrating existing techniques like BERT with WCNs and context.

The paper tackled the problem of ASR errors degrading Spoken Language Understanding (SLU) performance by proposing a BERT-based model that jointly encodes word confusion networks and dialogue context, achieving significant improvements over previous state-of-the-art models on the DSTC2 benchmark.

Spoken Language Understanding (SLU) converts hypotheses from an automatic speech recognizer (ASR) into structured semantic representations. ASR errors can severely degrade the performance of the subsequent SLU module. To address this issue, word confusion networks (WCNs), which contain richer information than 1-best or n-best hypothesis lists, have been used to encode the input for SLU. To further reduce ambiguity, the last system act of the dialogue context is also used as additional input. In this paper, a novel BERT-based SLU model (WCN-BERT SLU) is proposed to encode WCNs and the dialogue context jointly. It integrates both the structural information and the ASR posterior probabilities of WCNs into the BERT architecture. Experiments on DSTC2, an SLU benchmark, show that the proposed method is effective and significantly outperforms previous state-of-the-art models.
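
To make the WCN input concrete, here is a minimal sketch of how a word confusion network might be represented and flattened into a token sequence that keeps both its structure and its ASR posteriors. The abstract does not specify the encoding, so everything here is an assumption for illustration: the `Bin` type, the `flatten_wcn` and `one_best` helpers, the choice to give all alternatives in one bin the same position index, and the toy probabilities. It is not the authors' implementation.

```python
from typing import List, Tuple

# Hypothetical representation: a WCN is a sequence of bins, and each bin
# holds competing word hypotheses with their ASR posterior probabilities.
Bin = List[Tuple[str, float]]


def flatten_wcn(wcn: List[Bin]) -> Tuple[List[str], List[int], List[float]]:
    """Flatten a WCN into parallel lists of tokens, positions, and posteriors.

    Assumption for illustration: alternatives in the same bin share one
    position index, so the bin structure survives the flattening.
    """
    tokens: List[str] = []
    positions: List[int] = []
    probs: List[float] = []
    for pos, alternatives in enumerate(wcn):
        # Most probable alternative first within each bin.
        for word, p in sorted(alternatives, key=lambda wp: -wp[1]):
            tokens.append(word)
            positions.append(pos)
            probs.append(p)
    return tokens, positions, probs


def one_best(wcn: List[Bin]) -> List[str]:
    """Return the 1-best hypothesis: the top word of every bin."""
    return [max(alternatives, key=lambda wp: wp[1])[0] for alternatives in wcn]


# Toy WCN with made-up posteriors.
wcn = [
    [("cheap", 0.7), ("chip", 0.3)],
    [("restaurant", 1.0)],
]

tokens, positions, probs = flatten_wcn(wcn)
print(tokens)      # flattened word alternatives
print(positions)   # shared index per bin
print(probs)       # ASR posteriors aligned with tokens
print(one_best(wcn))
```

The 1-best list discards the competing word "chip" and its probability, while the flattened form keeps every alternative, which is the extra information the abstract attributes to WCNs.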

Code Implementations: 1 repo