CL · Sep 23, 2021

Corpus and Models for Lemmatisation and POS-tagging of Old French

arXiv:2109.11442v1 · 7 citations
Originality Synthesis-oriented
AI Analysis

This work addresses a specific need for researchers in historical linguistics by providing tools for Old French, but it appears incremental as part of an ongoing project.

The paper tackles the problem of lemmatization and POS-tagging for Old French, an under-resourced historic language with high linguistic variation, by developing neural taggers and dedicated corpora, though no concrete performance numbers are provided.

Old French is a typical example of an under-resourced historic language that, furthermore, displays a considerable amount of linguistic variation. In this paper, we present the current results of a long-running project (2015-...) and describe how we broached the difficult question of providing lemmatisation and POS models for Old French with the help of neural taggers and the progressive constitution of dedicated corpora.

Code Implementations: 1 repo