CL · AI · Apr 8, 2024

Language-Independent Representations Improve Zero-Shot Summarization

arXiv:2404.05720v1 · 30 citations · h-index: 9 · Has Code · NAACL
Originality: Incremental advance
AI Analysis

This work addresses poor zero-shot summarization performance in multilingual applications. It appears incremental, since it builds on existing finetuning and adversarial methods.

The paper tackles catastrophic forgetting in zero-shot summarization by learning language-independent representations, which improves zero-shot transfer to new languages and language pairs.

Finetuning pretrained models on downstream generation tasks often leads to catastrophic forgetting in zero-shot conditions. In this work, we focus on summarization and tackle the problem through the lens of language-independent representations. After training on monolingual summarization, we perform zero-shot transfer to new languages or language pairs. We first show naively finetuned models are highly language-specific in both output behavior and internal representations, resulting in poor zero-shot performance. Next, we propose query-key (QK) finetuning to decouple task-specific knowledge from the pretrained language generation abilities. Then, after showing downsides of the standard adversarial language classifier, we propose a balanced variant that more directly enforces language-agnostic representations. Moreover, our qualitative analyses show removing source language identity correlates to zero-shot summarization performance. Our code is openly available.
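As a rough illustration of the query-key (QK) finetuning idea mentioned in the abstract, the sketch below selects which parameters would be updated if only the attention query and key projections are trained and everything else stays frozen. This is one reading of the abstract, not the authors' released code. The function name `qk_trainable_mask`, the marker strings `q_proj` and `k_proj`, and the example parameter names are all assumptions chosen for illustration, as is the choice to treat biases the same way as weights.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical name components that identify query/key projections.
# Real models may name these modules differently; adjust as needed.
QK_MARKERS: Tuple[str, ...] = ("q_proj", "k_proj")


def qk_trainable_mask(
    param_names: Iterable[str],
    markers: Tuple[str, ...] = QK_MARKERS,
) -> Dict[str, bool]:
    """Map each parameter name to True if it would be updated.

    A parameter counts as trainable only when one of the dot-separated
    components of its name is a query or key projection marker.
    Everything else (values, output projections, feed-forward layers,
    embeddings) is marked as frozen.
    """
    mask: Dict[str, bool] = {}
    for name in param_names:
        components = name.split(".")
        mask[name] = any(marker in components for marker in markers)
    return mask


if __name__ == "__main__":
    # Illustrative parameter names only; not taken from the paper.
    example_names = [
        "encoder.layers.0.self_attn.q_proj.weight",
        "encoder.layers.0.self_attn.k_proj.weight",
        "encoder.layers.0.self_attn.v_proj.weight",
        "encoder.layers.0.fc1.weight",
        "decoder.embed_tokens.weight",
    ]
    for name, trainable in qk_trainable_mask(example_names).items():
        status = "train" if trainable else "freeze"
        print(f"{status:6s} {name}")
```

With a PyTorch model, a mask like this could be applied by looping over `model.named_parameters()` and setting `param.requires_grad = mask[name]` before building the optimizer, so that only the selected projections receive gradient updates.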

