CL · DL · Jun 22, 2022

Connecting a French Dictionary from the Beginning of the 20th Century to Wikidata

arXiv:2206.11022v3 · 588 citations · h-index: 30 · Has Code
Originality: Synthesis-oriented
AI Analysis

This work provides a resource for researchers in digital humanities and lexicography to analyze historical cultural data more efficiently, though it is incremental as it applies existing linking methods to a specific dataset.

The authors tackled the problem of manually verifying historical entries in the 1905 Petit Larousse illustré dictionary by connecting 20,245 entries from its history and geography part to Wikidata identifiers, enabling automated identification, comparison, and verification of culturally significant representations.

The Petit Larousse illustré is a French dictionary first published in 1905. Its division into two main parts, one on language and one on history and geography, marks a major milestone in French lexicography, and the dictionary is also a repository of general knowledge from this period. Although many entries from 1905 remain valid, some descriptions are now more historical than contemporary. They are nonetheless significant for analyzing and understanding the cultural representations of the time. Comparing these entries with more recent information, or verifying them, would require tedious manual work. In this paper, we describe a new lexical resource in which we connected all the dictionary entries of the history and geography part to current data sources by linking each entry to a Wikidata identifier. Using the Wikidata links, we can more easily automate the identification, comparison, and verification of historically situated representations. We give a few examples of how to process Wikidata identifiers, and we carried out a small analysis of the entities described in the dictionary to outline possible applications. The resource, i.e. the annotation of 20,245 dictionary entries with Wikidata links, is available from GitHub: https://github.com/pnugues/petit_larousse_1905/
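As an illustration of what processing Wikidata identifiers can look like, here is a minimal Python sketch. It is not the authors' code, and it does not assume anything about the file layout of the GitHub resource. It only assumes that each linked entry carries a Wikidata QID such as Q90 (Paris) or Q142 (France). The helper names `is_valid_qid` and `build_sparql_query` are hypothetical. The sketch validates the identifiers and builds a SPARQL query that asks for each entity's label and its "instance of" (P31) values.

```python
import re

# A Wikidata item identifier is the letter Q followed by a positive integer.
QID_PATTERN = re.compile(r"^Q[1-9]\d*$")


def is_valid_qid(value: str) -> bool:
    """Return True if `value` looks like a Wikidata item identifier."""
    return bool(QID_PATTERN.match(value))


def build_sparql_query(qids, language="fr"):
    """Build a SPARQL query for the labels and P31 values of the given QIDs.

    Hypothetical helper: strings that are not well-formed QIDs are skipped.
    """
    valid = [qid for qid in qids if is_valid_qid(qid)]
    if not valid:
        raise ValueError("no valid Wikidata identifiers given")
    values = " ".join(f"wd:{qid}" for qid in valid)
    return (
        "SELECT ?item ?itemLabel ?instanceOfLabel WHERE {\n"
        f"  VALUES ?item {{ {values} }}\n"
        "  OPTIONAL { ?item wdt:P31 ?instanceOf . }\n"
        f'  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "{language}". }}\n'
        "}"
    )


# Example identifiers: Q90 is Paris and Q142 is France; "Paris" is rejected.
query = build_sparql_query(["Q90", "Q142", "Paris"])
print(query)
```

The resulting query string can be submitted to the public Wikidata Query Service endpoint at https://query.wikidata.org/sparql. Grouping the returned "instance of" labels is one way to approximate the kind of small entity analysis the abstract mentions.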

Code Implementations: 1 repo
