CL · LG · Oct 11, 2020

Unsupervised Distillation of Syntactic Information from Contextualized Word Representations

arXiv:2010.05265v2997 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of separating syntactic from semantic information in the representations of neural language models, with the goal of improving parsing. It represents an incremental advance in representation learning.

The paper tackles unsupervised disentanglement of semantics and structure in contextualized word representations. It learns a transformation that discards lexical semantics but retains structural information, and the resulting distilled representations outperform the original ones in few-shot parsing.

Contextualized word representations, such as ELMo and BERT, were shown to perform well on various semantic and syntactic tasks. In this work, we tackle the task of unsupervised disentanglement between semantics and structure in neural language representations: we aim to learn a transformation of the contextualized vectors that discards the lexical semantics but keeps the structural information. To this end, we automatically generate groups of sentences which are structurally similar but semantically different, and use a metric-learning approach to learn a transformation that emphasizes the structural component that is encoded in the vectors. We demonstrate that our transformation clusters vectors in space by structural properties, rather than by lexical semantics. Finally, we demonstrate the utility of our distilled representations by showing that they outperform the original contextualized representations in a few-shot parsing setting.
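To make the metric-learning idea concrete, here is a minimal sketch. The abstract does not name the objective, so this sketch assumes a triplet loss, which is one common metric-learning choice. The function name `triplet_loss`, the linear map `W`, and the two-dimensional toy vectors (one "structure" axis, one "lexical" axis) are all hypothetical stand-ins for illustration, not the authors' implementation or real ELMo/BERT vectors.

```python
import numpy as np


def triplet_loss(W, anchors, positives, negatives, margin=1.0):
    """Mean hinge triplet loss over rows, computed after applying the linear map W.

    Each anchor should end up closer to its positive (same structure,
    different words) than to its negative (different structure) by `margin`.
    """
    a = anchors @ W.T
    p = positives @ W.T
    n = negatives @ W.T
    d_pos = np.linalg.norm(a - p, axis=1)  # distance to structurally similar vector
    d_neg = np.linalg.norm(a - n, axis=1)  # distance to structurally different vector
    return float(np.mean(np.maximum(0.0, d_pos - d_neg + margin)))


# Toy vectors (assumption): dimension 0 encodes structure, dimension 1 encodes lexical content.
anchors = np.array([[0.0, 0.0], [2.0, 1.0]])
positives = np.array([[0.0, 3.0], [2.0, -2.0]])  # same structure, different lexical value
negatives = np.array([[2.0, 0.0], [0.0, 1.0]])   # different structure, same lexical value

identity = np.eye(2)                     # leaves the vectors unchanged
keep_structure = np.array([[1.0, 0.0]])  # projection that drops the lexical axis

loss_identity = triplet_loss(identity, anchors, positives, negatives)
loss_projected = triplet_loss(keep_structure, anchors, positives, negatives)

print("loss with identity map:       ", loss_identity)
print("loss with structure-only map: ", loss_projected)
```

On this toy data the projection that discards the lexical axis scores a lower loss than the identity map, which is the behavior a learned transformation would be trained toward. In practice `W` would be optimized by gradient descent over many automatically generated sentence groups rather than set by hand.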

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
