CL · May 18, 2020

Span-ConveRT: Few-shot Span Extraction for Dialog with Pretrained Conversational Representations

arXiv:2005.08866v21015 citations
AI Analysis

This work addresses slot-filling in conversational AI for domains like restaurant booking, offering an incremental improvement with a new dataset.

The paper tackles dialog slot-filling by framing it as a turn-based span extraction task, introducing Span-ConveRT, which leverages pretrained conversational models to achieve consistent gains in few-shot learning scenarios over baseline methods.

We introduce Span-ConveRT, a light-weight model for dialog slot-filling which frames the task as a turn-based span extraction task. This formulation allows for a simple integration of conversational knowledge coded in large pretrained conversational models such as ConveRT (Henderson et al., 2019). We show that leveraging such knowledge in Span-ConveRT is especially useful for few-shot learning scenarios: we report consistent gains over 1) a span extractor that trains representations from scratch in the target domain, and 2) a BERT-based span extractor. In order to inspire more work on span extraction for the slot-filling task, we also release RESTAURANTS-8K, a new challenging data set of 8,198 utterances, compiled from actual conversations in the restaurant booking domain.
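To make the span extraction framing concrete, here is a minimal sketch of how a slot value can be represented as a contiguous token span within a single turn and recovered from per-token labels. It is not the Span-ConveRT implementation: the whitespace tokenizer, the two-label scheme ("I" for inside the span, "O" for outside), and the function names are all assumptions made for illustration, and the example utterance is invented rather than taken from RESTAURANTS-8K.

```python
from typing import List, Optional, Tuple


def tokens_with_offsets(utterance: str) -> List[Tuple[str, int, int]]:
    """Whitespace-tokenize and keep each token's character offsets."""
    tokens = []
    pos = 0
    for tok in utterance.split():
        start = utterance.index(tok, pos)
        end = start + len(tok)
        tokens.append((tok, start, end))
        pos = end
    return tokens


def span_to_tags(utterance: str, span_start: int, span_end: int) -> List[str]:
    """Label a token 'I' if it lies fully inside the character span
    [span_start, span_end), otherwise 'O' (hypothetical label scheme)."""
    return [
        "I" if start >= span_start and end <= span_end else "O"
        for _, start, end in tokens_with_offsets(utterance)
    ]


def tags_to_span(utterance: str, tags: List[str]) -> Optional[str]:
    """Recover the first contiguous run of 'I' tokens as a substring,
    or None if no token is labeled 'I'."""
    start = end = None
    for (_, tok_start, tok_end), tag in zip(tokens_with_offsets(utterance), tags):
        if tag == "I":
            if start is None:
                start = tok_start
            end = tok_end
        elif start is not None:
            break  # the span has ended; ignore later tokens
    return None if start is None else utterance[start:end]


# Invented example turn with a "time" slot value.
utterance = "a table for 4 people at 7 pm please"
value = "7 pm"
span_start = utterance.index(value)
tags = span_to_tags(utterance, span_start, span_start + len(value))
extracted = tags_to_span(utterance, tags)
```

In this sketch a trained model would predict `tags` from the turn's token representations; here they are derived from a gold annotation only to show the round trip between a character span and token labels.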
