CL | AI | Apr 15

Generating Concept Lexicalizations via Dictionary-Based Cross-Lingual Sense Projection

arXiv:2604.14397 | 60.6 | h-index: 5
AI Analysis

For computational linguists, this offers an interpretable method to generate sense inventories for new languages with improved precision.

The paper tackles automatic expansion of WordNet-style lexical resources to new languages via sense generation. The proposed project-and-filter strategy improves precision over prior methods while requiring few external resources.

We study the task of automatically expanding WordNet-style lexical resources to new languages through sense generation. We generate senses by associating target-language lemmas with existing lexical concepts via semantic projection. Given a sense-tagged English corpus and its translation, our method projects English synsets onto aligned target-language tokens and assigns the corresponding lemmas to those synsets. To generate these alignments and ensure their quality, we augment a pre-trained base aligner with a bilingual dictionary, which is also used to filter out incorrect sense projections. We evaluate the method on multiple languages, comparing it to prior methods, as well as dictionary-based and large language model baselines. Results show that the proposed project-and-filter strategy improves precision while remaining interpretable and requiring few external resources. We plan to make our code, documentation, and generated sense inventories accessible.
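
The project-and-filter step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `project_and_filter`, the data layout, the synset labels, and the toy Spanish example are all assumptions made for illustration. The sketch takes word alignments as given and covers only the dictionary filter, not the dictionary-augmented aligner.

```python
from collections import defaultdict


def project_and_filter(en_tokens, tgt_lemmas, alignment, dictionary):
    """Project English synsets onto aligned target lemmas, then filter.

    en_tokens:  list of (english_lemma, synset_id or None), one per token
    tgt_lemmas: list of target-language lemmas, one per token
    alignment:  iterable of (english_index, target_index) pairs
    dictionary: dict mapping an English lemma to a set of target lemmas
    (All of these structures are hypothetical, chosen for this sketch.)
    """
    inventory = defaultdict(set)
    for en_i, tgt_j in alignment:
        en_lemma, synset = en_tokens[en_i]
        if synset is None:
            continue  # untagged token: nothing to project
        tgt_lemma = tgt_lemmas[tgt_j]
        # Filter: keep the projection only if the bilingual dictionary
        # lists the target lemma as a translation of the English lemma.
        if tgt_lemma in dictionary.get(en_lemma, set()):
            inventory[synset].add(tgt_lemma)
    return dict(inventory)


# Toy sentence pair with one deliberately wrong alignment link (2, 1).
en_tokens = [("the", None), ("bank", "bank.n.01"), ("flood", "flood.v.01")]
tgt_lemmas = ["el", "orilla", "inundar"]
alignment = [(0, 0), (1, 1), (2, 1), (2, 2)]
dictionary = {"bank": {"orilla", "banco"}, "flood": {"inundar"}}

inventory = project_and_filter(en_tokens, tgt_lemmas, alignment, dictionary)
print(inventory)
```

In this toy example the spurious link from "flood" to "orilla" is discarded because the dictionary does not support it, which is the sense in which filtering trades coverage for precision.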