CL LG AS · Oct 25, 2019

Exploring Multilingual Syntactic Sentence Representations

arXiv:1910.11768v1996 citations
Originality: Incremental advance
AI Analysis

This work addresses efficient syntactic representation learning for low-resource languages. It demonstrates transfer learning, but its method is an incremental advance.

The paper tackles the problem of learning sentence embeddings with syntactic structure, using a multilingual parallel corpus augmented with Universal POS tags. The results show that these embeddings can be learned with less training data and fewer parameters than state-of-the-art language models, while scoring better on the paper's evaluation metrics.

We study methods for learning sentence embeddings with syntactic structure. We focus on learning syntactic sentence embeddings from a multilingual parallel corpus augmented with Universal Part-of-Speech tags. We evaluate the quality of the learned embeddings by examining sentence-level nearest neighbours and functional dissimilarity in the embedding space. We also evaluate the ability of the method to learn syntactic sentence embeddings for low-resource languages and demonstrate strong evidence for transfer learning. Our results show that syntactic sentence embeddings can be learned with less training data and fewer model parameters, and with better evaluation metrics, than state-of-the-art language models.
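
The nearest-neighbour evaluation mentioned in the abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the abstract does not state the similarity measure or the encoder, so cosine similarity is used as a common choice, and the three-dimensional vectors, the POS-sequence keys, and the function names `cosine_similarity` and `nearest_neighbours` are hypothetical, not taken from the paper or its repository.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_neighbours(query, embeddings, k=1):
    """Return the k keys whose vectors are most similar to `query`, best first."""
    scored = [(cosine_similarity(query, vec), key) for key, vec in embeddings.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [key for _, key in scored[:k]]


# Made-up toy embeddings, keyed by the POS-tag sequence of each sentence.
# A real setup would obtain these vectors from a trained sentence encoder.
toy_embeddings = {
    "DET NOUN VERB": [0.9, 0.1, 0.0],
    "DET ADJ NOUN VERB": [0.8, 0.3, 0.1],
    "VERB PRON": [0.0, 0.2, 0.9],
}

query = [1.0, 0.1, 0.0]
print(nearest_neighbours(query, toy_embeddings, k=2))
# → ['DET NOUN VERB', 'DET ADJ NOUN VERB']
```

If the embeddings capture syntax, the neighbours of a sentence should share its syntactic pattern rather than its topic, which is what inspecting the returned keys is meant to reveal in this toy setting.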

Code Implementations: 1 repo
