CL · Jan 9, 2020

Simulating Lexical Semantic Change from Sense-Annotated Data

arXiv:2001.03216v1 · 27 citations
AI Analysis

This work addresses the need for better evaluation benchmarks in computational linguistics, specifically for lexical semantic change (LSC) detection. It is incremental in that it builds on existing synthetic data approaches.

The authors address the problem of evaluating lexical semantic change detection models by developing a simulation procedure that builds a synthetic dataset from synchronic sense-annotated data. The resulting dataset corresponds more closely to empirically observed change than previous synthetic datasets, and it provides the first large-scale evaluation gold standard.

We present a novel procedure to simulate lexical semantic change from synchronic sense-annotated data, and demonstrate its usefulness for assessing lexical semantic change detection models. The induced dataset represents a stronger correspondence to empirically observed lexical semantic change than previous synthetic datasets, because it exploits the intimate relationship between synchronic polysemy and diachronic change. We publish the data and provide the first large-scale evaluation gold standard for LSC detection models.
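
To make the general idea concrete, here is a minimal sketch of how change might be simulated from sense-annotated sentences: two pseudo-time-period corpora are sampled for one target word, each under a different sense distribution, and the divergence between the two distributions serves as a graded gold score. This is an illustration under stated assumptions, not the authors' procedure. The function names (`simulate_change`, `total_variation`), the toy sentences, the sense labels, and the choice of total variation distance as the gold score are all hypothetical.

```python
import random


def simulate_change(sense_sentences, dist_t1, dist_t2, n_per_period, seed=0):
    """Sample two pseudo-diachronic corpora for one target word.

    sense_sentences: dict mapping a sense label to a list of sentences
                     annotated with that sense (toy data in this sketch).
    dist_t1, dist_t2: dicts mapping each sense label to its sampling
                      probability in period 1 and period 2 (assumed inputs).
    Returns two lists of (sense, sentence) pairs.
    """
    rng = random.Random(seed)

    def sample(dist):
        senses = list(dist)
        weights = [dist[s] for s in senses]
        # Draw sense labels according to the period's sense distribution,
        # then pick one annotated sentence for each drawn sense.
        drawn = rng.choices(senses, weights=weights, k=n_per_period)
        return [(s, rng.choice(sense_sentences[s])) for s in drawn]

    return sample(dist_t1), sample(dist_t2)


def total_variation(dist_a, dist_b):
    """Total variation distance between two sense distributions (0 to 1).

    Used here as a hypothetical graded gold score: 0 means no change,
    1 means the word's senses were completely replaced.
    """
    senses = set(dist_a) | set(dist_b)
    return 0.5 * sum(abs(dist_a.get(s, 0.0) - dist_b.get(s, 0.0)) for s in senses)


# Toy sense-annotated data (invented for illustration only).
SENSES = {
    "bank/river": ["She sat on the bank of the stream.",
                   "The bank was overgrown with reeds."],
    "bank/finance": ["The bank approved the loan.",
                     "He deposited the cheque at the bank."],
}

period_1 = {"bank/river": 0.8, "bank/finance": 0.2}
period_2 = {"bank/river": 0.3, "bank/finance": 0.7}

corpus_1, corpus_2 = simulate_change(SENSES, period_1, period_2, n_per_period=10)
gold_score = total_variation(period_1, period_2)
```

A detection model would then be run on `corpus_1` and `corpus_2` with the sense labels hidden, and its predicted change score compared against `gold_score` across many target words.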

