CL, LG | Jun 6, 2019

Cross-Lingual Syntactic Transfer through Unsupervised Adaptation of Invertible Projections

arXiv:1906.02656v41096 citations
Originality: Incremental advance
AI Analysis

This paper addresses the problem of building syntactic analysis tools for low-resource languages that are typologically distant from the source language. The reported gains are strong, but the method itself is an incremental advance.

The paper tackles cross-lingual syntactic transfer to typologically distant languages without annotated target data or parallel corpora. On 10 languages distant from English, it reports an average absolute improvement of 5.2% in POS tagging and 8.3% in dependency parsing over a direct transfer method.

Cross-lingual transfer is an effective way to build syntactic analysis tools in low-resource languages. However, transfer is difficult when transferring to typologically distant languages, especially when neither annotated target data nor parallel corpora are available. In this paper, we focus on methods for cross-lingual transfer to distant languages and propose to learn a generative model with a structured prior that utilizes labeled source data and unlabeled target data jointly. The parameters of source model and target model are softly shared through a regularized log likelihood objective. An invertible projection is employed to learn a new interlingual latent embedding space that compensates for imperfect cross-lingual word embedding input. We evaluate our method on two syntactic tasks: part-of-speech (POS) tagging and dependency parsing. On the Universal Dependency Treebanks, we use English as the only source corpus and transfer to a wide range of target languages. On the 10 languages in this dataset that are distant from English, our method yields an average of 5.2% absolute improvement on POS tagging and 8.3% absolute improvement on dependency parsing over a direct transfer method using state-of-the-art discriminative models.
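The two mechanisms named in the abstract, an invertible projection and soft parameter sharing through a regularized log likelihood, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes an additive coupling layer as one common way to build an invertible map, and an L2 penalty as one possible form of the regularizer. The names `AdditiveCoupling`, `regularized_objective`, and `lam` are hypothetical and chosen only for illustration.

```python
import numpy as np


class AdditiveCoupling:
    """Additive coupling layer (an assumed choice of invertible projection).

    The input is split in half; the first half passes through unchanged and
    the second half is shifted by a function of the first. The map is
    invertible by construction and its Jacobian determinant is 1.
    """

    def __init__(self, dim, hidden, rng):
        assert dim % 2 == 0, "dim must be even so the input can be split in half"
        self.half = dim // 2
        self.W1 = rng.normal(scale=0.1, size=(self.half, hidden))
        self.W2 = rng.normal(scale=0.1, size=(hidden, self.half))

    def _shift(self, a):
        # Small one-hidden-layer network; it does not need to be invertible.
        return np.tanh(a @ self.W1) @ self.W2

    def forward(self, x):
        a, b = x[..., :self.half], x[..., self.half:]
        return np.concatenate([a, b + self._shift(a)], axis=-1)

    def inverse(self, z):
        a, b = z[..., :self.half], z[..., self.half:]
        return np.concatenate([a, b - self._shift(a)], axis=-1)


def regularized_objective(target_log_lik, theta_target, theta_source, lam):
    """Target log likelihood minus an L2 penalty tying target to source parameters.

    The L2 form and the weight `lam` are assumptions made for this sketch.
    """
    penalty = 0.5 * lam * sum(
        np.sum((t - s) ** 2) for t, s in zip(theta_target, theta_source)
    )
    return target_log_lik - penalty


rng = np.random.default_rng(0)
layer = AdditiveCoupling(dim=8, hidden=16, rng=rng)
embeddings = rng.normal(size=(5, 8))      # stand-in for cross-lingual word embeddings
latent = layer.forward(embeddings)        # projected latent embeddings
recovered = layer.inverse(latent)         # the inverse recovers the input
```

In this sketch, a larger `lam` keeps the target parameters closer to the source parameters, while `lam = 0` reduces the objective to the unregularized target log likelihood.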

Code Implementations: 1 repo