CL · Aug 9, 2022

Where's the Learning in Representation Learning for Compositional Semantics and the Case of Thematic Fit

arXiv:2208.04749v3289 citations · h-index: 17
Originality: Synthesis-oriented
AI Analysis

This work examines where the learning resides in representation learning for compositional semantics, a question relevant to NLP researchers. Its contribution is incremental: it builds on existing embedding and task analyses.

The paper investigates why random embeddings sometimes match pretrained ones on NLP tasks such as semantic role prediction and thematic fit estimation. It finds that where the learning is encoded depends on how the task relates to the training objective, and that some tasks' quality scores vary non-monotonically with training data size.

Observing that for certain NLP tasks, such as semantic role prediction or thematic fit estimation, random embeddings perform as well as pretrained embeddings, we explore what settings allow for this and examine where most of the learning is encoded: the word embeddings, the semantic role embeddings, or "the network". We find nuanced answers, depending on the task and its relation to the training objective. We examine these representation learning aspects in multi-task learning, where role prediction and role-filling are supervised tasks, while several thematic fit tasks are outside the models' direct supervision. We observe a non-monotonic relation between some tasks' quality scores and the training data size. To better understand this observation, we analyze these results using easier, per-verb versions of these tasks.
