CL · Jul 29, 2016

A Novel Bilingual Word Embedding Method for Lexical Translation Using Bilingual Sense Clique

arXiv:1607.08692v2 · 2 citations
Originality: Incremental advance
AI Analysis

This work addresses bilingual lexicon translation, a natural language processing task. It is an incremental advance: it builds on existing embedding methods and adds a sense-based approach.

The paper tackles bilingual word embedding by introducing a Bilingual Sense Clique (BSC) derived from a graph over a bilingual corpus, which removes the need for a separate projection step. Empirical results show that the method outperforms existing bilingual word embedding methods on lexicon translation tasks.

Most existing methods for bilingual word embedding consider only shallow context or simple co-occurrence information. In this paper, we propose a latent bilingual sense unit (Bilingual Sense Clique, BSC), which is derived from a maximum complete sub-graph of a pointwise mutual information based graph over a bilingual corpus. In this way, we treat source and target words equally, and the separate bilingual projection step required by most existing work is no longer necessary. Several dimension reduction methods are evaluated to summarize the BSC-word relationship. The proposed method is evaluated on bilingual lexicon translation tasks, and empirical results show that bilingual sense embedding methods outperform existing bilingual word embedding methods.
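The pipeline described in the abstract (a PMI graph over a bilingual corpus, complete sub-graphs as sense units, then a BSC-word relationship) can be illustrated with a minimal sketch. This is not the authors' implementation. Several choices here are assumptions made only for illustration: each aligned sentence pair is treated as one co-occurrence window, the PMI threshold is 0, words are tagged "src" or "tgt", cliques are enumerated with the Bron-Kerbosch algorithm, only cliques containing both languages are kept, and the toy English-French corpus and all function names are hypothetical. The dimension reduction step is left out; it would be applied to the membership matrix built at the end.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_graph(sentence_pairs, threshold=0.0):
    """Link two words when their PMI over aligned sentence pairs exceeds threshold.

    Assumption: one aligned sentence pair is one co-occurrence window, and each
    word is tagged with its side ("src" or "tgt") so both languages share a graph.
    """
    n = len(sentence_pairs)
    word_count, pair_count = Counter(), Counter()
    for src, tgt in sentence_pairs:
        words = {("src", w) for w in src} | {("tgt", w) for w in tgt}
        word_count.update(words)
        # Sorting gives every unordered pair a single canonical key.
        pair_count.update(combinations(sorted(words), 2))
    graph = {w: set() for w in word_count}
    for (a, b), c in pair_count.items():
        # PMI(a, b) = log(p(a, b) / (p(a) * p(b))), with counts over n windows.
        pmi = math.log(c * n / (word_count[a] * word_count[b]))
        if pmi > threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def maximal_cliques(graph):
    """Enumerate maximal complete sub-graphs with basic Bron-Kerbosch."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & graph[v], x & graph[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(graph), set())
    return cliques


def bilingual_cliques(graph):
    """Keep only cliques with words from both sides (an assumption of this sketch)."""
    return [c for c in maximal_cliques(graph)
            if {lang for lang, _ in c} == {"src", "tgt"}]


def clique_membership(graph, cliques):
    """Binary word-by-clique relationship: word -> list of 0/1 flags, one per clique."""
    return {w: [int(w in c) for c in cliques] for w in graph}


# Hypothetical toy corpus of aligned (source, target) token lists.
pairs = [
    (["bank", "money"], ["banque", "argent"]),
    (["bank", "river"], ["rive", "fleuve"]),
    (["money", "cash"], ["argent"]),
]
graph = pmi_graph(pairs)
cliques = bilingual_cliques(graph)
membership = clique_membership(graph, cliques)
for clique in cliques:
    print(sorted(clique))
```

In this toy corpus the source word "bank" ends up in two different cliques, one with river-related words and one with "banque", which is the kind of sense separation the BSC idea is after. Source and target words are nodes of the same graph, so no separate projection between two embedding spaces appears anywhere in the sketch.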

