CL · AI · Apr 22, 2018

Same Representation, Different Attentions: Shareable Sentence Representation Learning from Multiple Tasks

arXiv:1804.08139v1 · 30 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of limited training data and inconsistent representations in NLP, offering a method to enhance sentence representation learning for various downstream tasks.

The paper tackles the problem that a sentence's representation varies from task to task. It proposes a multi-task learning scheme in which all tasks share the same sentence representation and each task uses its own attention mechanism to select the information relevant to it. The authors report improved performance on 16 text classification tasks.

Distributed representation plays an important role in deep learning based natural language processing. However, the representation of a sentence often varies in different tasks, which is usually learned from scratch and suffers from the limited amounts of training data. In this paper, we claim that a good sentence representation should be invariant and can benefit the various subsequent tasks. To achieve this purpose, we propose a new scheme of information sharing for multi-task learning. More specifically, all tasks share the same sentence representation and each task can select the task-specific information from the shared sentence representation with attention mechanism. The query vector of each task's attention could be either static parameters or generated dynamically. We conduct extensive experiments on 16 different text classification tasks, which demonstrate the benefits of our architecture.
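
The sharing scheme described in the abstract can be illustrated with a minimal sketch, assuming a static query vector per task and plain dot-product scoring. Both are simplifications chosen for illustration, not the paper's exact formulation; the names `attend`, `shared_states`, and `task_queries`, and all numeric values, are hypothetical. The point is only that every task reads the same per-token states and differs solely in its query.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(hidden_states, query):
    """Pool shared per-token states with one task's query.

    Scoring is a plain dot product (an assumption made for this sketch).
    Returns the attention weights and the weighted sum of the states.
    """
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return weights, pooled


# Shared sentence representation: one vector per token (toy values).
shared_states = [
    [1.0, 0.0],
    [0.0, 1.0],
    [0.5, 0.5],
]

# One static query vector per task (hypothetical task names and values).
task_queries = {
    "task_a": [2.0, 0.0],
    "task_b": [0.0, 2.0],
}

# Each task gets its own view of the same shared states.
task_views = {name: attend(shared_states, q) for name, q in task_queries.items()}
```

In this toy setup the two tasks place their largest attention weight on different tokens even though the underlying states are identical. A dynamically generated query, the other option the abstract mentions, would replace the fixed entries of `task_queries` with vectors computed from the input.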

