CL · Jan 9, 2020

Simulating Lexical Semantic Change from Sense-Annotated Data

Dominik Schlechtweg, Sabine Schulte im Walde

arXiv:2001.03216v1 · 1.927 citations

Originality: Incremental advance

AI Analysis

This work addresses the need for better evaluation benchmarks in computational linguistics, specifically for lexical semantic change detection. The advance is incremental, since it builds on existing synthetic data approaches.

The authors address the problem of evaluating lexical semantic change (LSC) detection models by developing a simulation procedure that builds a synthetic dataset from synchronic sense-annotated data. According to the authors, the resulting dataset corresponds more closely to empirically observed semantic change than previous synthetic datasets, and it provides the first large-scale evaluation gold standard for LSC detection models.

We present a novel procedure to simulate lexical semantic change from synchronic sense-annotated data, and demonstrate its usefulness for assessing lexical semantic change detection models. The induced dataset represents a stronger correspondence to empirically observed lexical semantic change than previous synthetic datasets, because it exploits the intimate relationship between synchronic polysemy and diachronic change. We publish the data and provide the first large-scale evaluation gold standard for LSC detection models.
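The general idea of deriving simulated change from sense annotations can be illustrated with a minimal Python sketch. This is not the authors' procedure, only one plausible reading of it: usages of a polysemous word are sampled into two pseudo time periods with different sense distributions, and the divergence between the two distributions serves as a graded change score. The function names, the sampling scheme, and the use of Jensen-Shannon divergence as the score are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def simulate_change(usages, probs_t1, probs_t2, n_per_period, seed=0):
    """Sample two pseudo-diachronic corpora from sense-annotated usages.

    usages: list of (sentence, sense_label) pairs for one target word.
    probs_t1 / probs_t2: hypothetical sense distributions for each period.
    Returns two lists of (sentence, sense_label) pairs.
    """
    rng = random.Random(seed)
    by_sense = {}
    for sentence, sense in usages:
        by_sense.setdefault(sense, []).append(sentence)

    def sample(probs):
        senses = list(probs)
        weights = [probs[s] for s in senses]
        drawn = rng.choices(senses, weights=weights, k=n_per_period)
        # Draw one usage (with replacement) for each sampled sense.
        return [(rng.choice(by_sense[s]), s) for s in drawn]

    return sample(probs_t1), sample(probs_t2)


def sense_distribution(corpus):
    """Relative frequency of each sense label in a sampled corpus."""
    counts = Counter(sense for _, sense in corpus)
    total = sum(counts.values())
    return {sense: n / total for sense, n in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two sense distributions."""
    keys = set(p) | set(q)
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a):
        return sum(a[k] * math.log2(a[k] / m[k]) for k in a if a[k] > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)


# Toy example: a word shifts entirely from one sense to another.
usages = [("the river bank flooded", "shore"),
          ("we sat on the bank", "shore"),
          ("the bank raised rates", "finance"),
          ("she works at a bank", "finance")]
t1, t2 = simulate_change(usages,
                         {"shore": 1.0, "finance": 0.0},
                         {"shore": 0.0, "finance": 1.0},
                         n_per_period=10)
score = js_divergence(sense_distribution(t1), sense_distribution(t2))
```

In this toy setup the gold change score is known by construction, which is what makes a simulated dataset usable as an evaluation standard: a detection model's predicted change for each target word can be compared against the score implied by the chosen sense distributions.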
