Yixing Luan

2 Papers

CL · Jun 11, 2021
Semi-Supervised and Unsupervised Sense Annotation via Translations

Bradley Hauer, Grzegorz Kondrak, Yixing Luan et al.

Acquisition of multilingual training data continues to be a challenge in word sense disambiguation (WSD). To address this problem, unsupervised approaches have been proposed to automatically generate sense annotations for training supervised WSD systems. We present three new methods for creating sense-annotated corpora which leverage translations, parallel bitexts, lexical resources, as well as contextual and synset embeddings. Our semi-supervised method applies machine translation to transfer existing sense annotations to other languages. Our two unsupervised methods refine sense annotations produced by a knowledge-based WSD system via lexical translations in a parallel corpus. We obtain state-of-the-art results on standard WSD benchmarks.

CL · Aug 21, 2018
You Shall Know the Most Frequent Sense by the Company it Keeps

Bradley Hauer, Yixing Luan, Grzegorz Kondrak

Identification of the most frequent sense of a polysemous word is an important semantic task. We introduce two concepts that can benefit MFS detection: companions, which are the most frequently co-occurring words, and the most frequent translation in a bitext. We present two novel methods that incorporate these new concepts, and show that they advance the state of the art on MFS detection.