CL · AI · LG · ML · Jan 18, 2018

Natural Language Multitasking: Analyzing and Improving Syntactic Saliency of Hidden Representations

arXiv:1801.06024v1 · 7 citations
Originality: Incremental advance
AI Analysis

This work examines how to understand and improve the syntactic saliency of hidden sentence representations, a question of interest to natural language processing researchers. It appears incremental because it builds on existing multi-task learning methods.

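The authors trained multi-task autoencoders on linguistic tasks and found that adding more decoders, such as translation and part-of-speech decoders, improved how well the models clustered sentences by syntactic similarity, because the representation space became less entangled. They also report that interpolating between sentences yields pseudo-English sentences with recognizable syntactic structure, and that adding the difference vector between two sentences to a third changes it in a meaningful way.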

We train multi-task autoencoders on linguistic tasks and analyze the learned hidden sentence representations. The representations change significantly when translation and part-of-speech decoders are added. The more decoders a model employs, the better it clusters sentences according to their syntactic similarity, as the representation space becomes less entangled. We explore the structure of the representation space by interpolating between sentences, which yields interesting pseudo-English sentences, many of which have recognizable syntactic structure. Lastly, we point out an interesting property of our models: The difference-vector between two sentences can be added to change a third sentence with similar features in a meaningful way.
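The two latent-space operations mentioned in the abstract, interpolating between sentences and adding a difference vector to a third sentence, can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes linear interpolation (the abstract does not name the scheme), the function names `interpolate` and `apply_difference` are hypothetical, and random vectors stand in for the encoder outputs. In a real model, each resulting vector would be passed to a decoder to produce a sentence.

```python
import numpy as np


def interpolate(z_a, z_b, steps=5):
    """Return `steps` points evenly spaced on the line from z_a to z_b, endpoints included.

    Assumption: linear interpolation; the abstract does not specify the scheme.
    """
    alphas = np.linspace(0.0, 1.0, steps)
    return np.stack([(1.0 - a) * z_a + a * z_b for a in alphas])


def apply_difference(z_src, z_tgt, z_third):
    """Add the difference vector (z_tgt - z_src) to a third representation."""
    return z_third + (z_tgt - z_src)


# Toy stand-ins for hidden sentence representations (hypothetical, 8-dimensional).
rng = np.random.default_rng(0)
z_a, z_b, z_c = rng.normal(size=(3, 8))

path = interpolate(z_a, z_b, steps=5)      # shape (5, 8); rows run from z_a to z_b
shifted = apply_difference(z_a, z_b, z_c)  # z_c moved by the z_a -> z_b offset
```

The first and last rows of `path` equal `z_a` and `z_b`, so decoding the rows in order would trace a path from one sentence to the other through intermediate representations.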

