CL AI · Apr 15

Generating Concept Lexicalizations via Dictionary-Based Cross-Lingual Sense Projection

David Basil, Chirooth Girigowda, Bradley Hauer, Sahir Momin, Ning Shi, Grzegorz Kondrak

arXiv:2604.14397 · 60.6 · h-index: 5

AI Analysis

For computational linguists, this paper offers an interpretable method for generating sense inventories for new languages with improved precision.

The paper tackles automatic expansion of WordNet-style lexical resources to new languages via sense generation. The proposed project-and-filter strategy improves precision over prior methods while requiring few external resources.

We study the task of automatically expanding WordNet-style lexical resources to new languages through sense generation. We generate senses by associating target-language lemmas with existing lexical concepts via semantic projection. Given a sense-tagged English corpus and its translation, our method projects English synsets onto aligned target-language tokens and assigns the corresponding lemmas to those synsets. To generate these alignments and ensure their quality, we augment a pre-trained base aligner with a bilingual dictionary, which is also used to filter out incorrect sense projections. We evaluate the method on multiple languages, comparing it to prior methods, as well as dictionary-based and large language model baselines. Results show that the proposed project-and-filter strategy improves precision while remaining interpretable and requiring few external resources. We plan to make our code, documentation, and generated sense inventories accessible.
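The project-and-filter idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The function name `project_and_filter`, the data layout, the synset identifiers, and the toy Spanish example are all hypothetical, and the filter rule (keep a projection only if the bilingual dictionary lists the target lemma as a translation of the aligned English lemma) is an assumption about how such a filter might work. The alignment is taken as given here; the paper obtains it from a pre-trained aligner augmented with the dictionary.

```python
from collections import defaultdict


def project_and_filter(src_tokens, tgt_lemmas, alignment, dictionary):
    """Project synsets from sense-tagged English tokens onto aligned
    target-language lemmas, keeping only dictionary-supported pairs.

    src_tokens: list of (english_lemma, synset_id or None)
    tgt_lemmas: list of target-language lemmas, one per target token
    alignment:  iterable of (src_index, tgt_index) pairs
    dictionary: dict mapping an English lemma to a set of target lemmas
    (All of these structures are illustrative assumptions.)
    """
    inventory = defaultdict(set)
    for src_idx, tgt_idx in alignment:
        en_lemma, synset = src_tokens[src_idx]
        if synset is None:
            continue  # untagged token: nothing to project
        tgt_lemma = tgt_lemmas[tgt_idx]
        # Assumed filter: discard projections the dictionary does not support.
        if tgt_lemma in dictionary.get(en_lemma, set()):
            inventory[synset].add(tgt_lemma)
    return dict(inventory)


# Toy example with placeholder synset identifiers.
src = [("bank", "bank.n.01"), ("of", None), ("river", "river.n.01")]
tgt = ["orilla", "de", "río"]
align = [(0, 0), (1, 1), (2, 2), (0, 2)]  # (0, 2) is a deliberately bad link
bilingual = {"bank": {"orilla", "banco"}, "river": {"río"}}

inventory = project_and_filter(src, tgt, align, bilingual)
```

In this toy run the spurious link from "bank" to "río" is dropped because the dictionary does not list "río" as a translation of "bank", which is the kind of precision gain the filter step is meant to provide.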
