CL · Sep 13, 2021

When is Wall a Pared and when a Muro? -- Extracting Rules Governing Lexical Selection

arXiv:2109.06014v1 · 4 citations · Has Code
Originality: Incremental advance
AI Analysis

The paper addresses the difficulty non-native language learners have in mastering subtle vocabulary differences. The advance is incremental, since it builds on existing methods for lexical analysis.

The paper tackles the problem of automatically identifying fine-grained lexical distinctions, such as when to use "pared" vs. "muro" for "wall" in Spanish, and extracts concise descriptions that explain these distinctions. It confirms the quality of these descriptions in a language learning setup for Spanish and Greek, showing that they can teach non-native speakers which translation to choose for an ambiguous word.

Learning fine-grained distinctions between vocabulary items is a key challenge in learning a new language. For example, the noun "wall" has different lexical manifestations in Spanish: "pared" refers to an indoor wall while "muro" refers to an outside wall. However, this kind of lexical distinction may not be obvious to non-native learners unless it is explained explicitly. In this work, we present a method for automatically identifying fine-grained lexical distinctions, and extracting concise descriptions explaining these distinctions in a human- and machine-readable format. We confirm the quality of these extracted descriptions in a language learning setup for two languages, Spanish and Greek, where we use them to teach non-native speakers when to translate a given ambiguous word into its different possible translations. Code and data are publicly released at https://github.com/Aditi138/LexSelection.
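To make the idea of a "human- and machine-readable" description concrete, here is a minimal sketch of what rule extraction for lexical selection could look like. It is not the authors' method or the code in the linked repository. The toy data, the rule format, and the names `EXAMPLES`, `extract_rules`, and `apply_rules` are all hypothetical, and the heuristic (keep context words that only ever co-occur with one translation) is a deliberately simple stand-in.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: English context words seen near "wall",
# paired with the Spanish translation used in that context.
EXAMPLES = [
    ({"paint", "bedroom"}, "pared"),
    ({"hang", "picture", "bedroom"}, "pared"),
    ({"kitchen", "paint"}, "pared"),
    ({"city", "stone"}, "muro"),
    ({"garden", "stone"}, "muro"),
    ({"city", "fortress"}, "muro"),
]


def extract_rules(examples, min_support=2):
    """Keep context words that co-occur with exactly one translation.

    Returns a list of dicts (the assumed machine-readable rule format),
    sorted by descending support and then alphabetically by context word.
    """
    counts = defaultdict(Counter)
    for context, translation in examples:
        for word in context:
            counts[word][translation] += 1

    rules = []
    for word, by_translation in counts.items():
        if len(by_translation) != 1:
            continue  # word appears with several translations: not decisive
        translation, support = next(iter(by_translation.items()))
        if support >= min_support:
            rules.append(
                {"if_context_has": word, "translate_as": translation, "support": support}
            )
    return sorted(rules, key=lambda r: (-r["support"], r["if_context_has"]))


def apply_rules(rules, context):
    """Return the translation of the first matching rule, or None if none match."""
    for rule in rules:
        if rule["if_context_has"] in context:
            return rule["translate_as"]
    return None


if __name__ == "__main__":
    for rule in extract_rules(EXAMPLES):
        # Human-readable rendering of the same rule.
        print(f'If the context mentions "{rule["if_context_has"]}", '
              f'translate "wall" as "{rule["translate_as"]}".')
```

Each rule is a plain dictionary, so the same object can be rendered as a sentence for a learner or consumed by a program. A realistic system would need richer features and a principled way to rank conflicting rules.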

Code Implementations: 1 repo
