CL · Jan 30, 2023

Active Learning for Multilingual Semantic Parser

arXiv:2301.12920v4 · 269 citations · h-index: 44
Originality: Incremental advance
AI Analysis

This work addresses the cost of creating datasets for multilingual semantic parsing and offers a practical solution for NLP researchers and practitioners. It is an incremental advance because it builds on existing active learning and parsing methods.

The paper tackles the high cost of manual translation in multilingual semantic parsing by proposing the first active learning procedure (AL-MSP) that selects a subset of examples for translation, reducing translation effort while achieving better parsing performance than baselines on two datasets.

Current multilingual semantic parsing (MSP) datasets are almost all collected by translating the utterances in the existing datasets from the resource-rich language to the target language. However, manual translation is costly. To reduce the translation effort, this paper proposes the first active learning procedure for MSP (AL-MSP). AL-MSP selects only a subset from the existing datasets to be translated. We also propose a novel selection method that prioritizes the examples diversifying the logical form structures with more lexical choices, and a novel hyperparameter tuning method that needs no extra annotation cost. Our experiments show that AL-MSP significantly reduces translation costs with ideal selection methods. Our selection method with proper hyperparameters yields better parsing performance than the other baselines on two multilingual datasets.
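The selection idea in the abstract (pick a subset for translation that covers diverse logical form structures and varied wording) can be illustrated with a toy sketch. This is not the paper's selection method: the masking heuristic in `lf_template`, the greedy scoring, and every name below are assumptions made for illustration, and the model-in-the-loop retraining of a full active learning procedure is left out.

```python
import re
from typing import List, Set, Tuple

Example = Tuple[str, str]  # (source-language utterance, logical form)


def lf_template(logical_form: str) -> str:
    """Crude structural signature (an assumption): mask quoted strings and numbers."""
    masked = re.sub(r'"[^"]*"', "<str>", logical_form)
    return re.sub(r"\b\d+\b", "<num>", masked)


def select_for_translation(pool: List[Example], budget: int) -> List[Example]:
    """Greedily pick up to `budget` examples to send for translation.

    An example with an unseen logical-form template always wins; ties are
    broken by how many unseen utterance words it contributes.
    """
    selected: List[Example] = []
    seen_templates: Set[str] = set()
    seen_words: Set[str] = set()
    remaining = list(pool)

    def gain(example: Example) -> Tuple[bool, int]:
        utterance, logical_form = example
        is_new_template = lf_template(logical_form) not in seen_templates
        new_word_count = len(set(utterance.lower().split()) - seen_words)
        return (is_new_template, new_word_count)

    while remaining and len(selected) < budget:
        best = max(remaining, key=gain)
        remaining.remove(best)
        selected.append(best)
        seen_templates.add(lf_template(best[1]))
        seen_words.update(best[0].lower().split())
    return selected


# Hypothetical pool of English utterances paired with logical forms.
pool = [
    ("flights from boston to denver", 'flight(from("boston"), to("denver"))'),
    ("flights from dallas to austin", 'flight(from("dallas"), to("austin"))'),
    ("what is the cheapest fare", "argmin(fare)"),
]
to_translate = select_for_translation(pool, budget=2)
```

With a budget of two, the sketch skips the second flight query because its logical form has the same structure as the first, and spends the remaining budget on the structurally different fare query.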

