The Logoscope: a Semi-Automatic Tool for Detecting and Documenting French New Words
This tool addresses the need for comprehensive documentation of new French words, primarily for linguists and lexicographers. The contribution is incremental: it builds on existing dictionary development tools by adding contextual analysis.
The authors address the detection and documentation of new French words with the Logoscope, a semi-automatic tool that collects new words daily from major French newspapers and provides contextual information, including the journalistic topics of the texts in which the words occur. They present it as the first tool designed specifically for this purpose.
In this article we present the design and implementation of the Logoscope, the first tool developed specifically to detect new words of the French language, document them, and make them publicly accessible through a web interface. This semi-automatic tool collects new words daily by browsing the online versions of well-known French newspapers such as Le Monde, Le Figaro, L'Équipe, Libération, La Croix, and Les Échos. In contrast to existing tools, which are essentially dedicated to dictionary development, the Logoscope attempts to give a more complete account of the context in which new words occur. In addition to the commonly given morpho-syntactic information, it provides information about the textual and discursive contexts of word creation; in particular, it automatically determines the (journalistic) topics of the text containing the new word. We first give a general overview of the tool. We then describe the approach taken, discuss the linguistic background that guided our design decisions, and present the computational methods used to implement the tool.
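To make the automatic half of such a pipeline concrete, the following is a minimal sketch of candidate detection by lexicon exclusion: tokens from an article that are absent from a reference word list are kept, with a short context window, for later human validation. This abstract does not state how the Logoscope filters candidates, so the exclusion-list technique, the tiny `KNOWN_WORDS` set, the function name `candidate_new_words`, the capitalisation heuristic, and the length threshold are all illustrative assumptions rather than the authors' method.

```python
import re
import unicodedata

# Hypothetical reference lexicon (assumption): a real system would load a
# large dictionary of attested French word forms instead of this toy set.
KNOWN_WORDS = {"ministre", "annoncé", "plan", "contre", "est", "les", "des", "une"}

# A token is a run of letters, optionally joined by internal hyphens.
TOKEN_RE = re.compile(r"[^\W\d_]+(?:-[^\W\d_]+)*")


def candidate_new_words(text, known_words=KNOWN_WORDS, min_length=3, window=30):
    """Return (word, context) pairs for tokens absent from the lexicon.

    Heuristics used here (all assumptions of this sketch):
    - capitalised tokens are skipped as likely proper nouns or sentence starts;
    - tokens shorter than `min_length` are skipped (elided articles, etc.);
    - each candidate form is reported once, at its first occurrence.
    """
    text = unicodedata.normalize("NFC", text)
    candidates = []
    seen = set()
    for match in TOKEN_RE.finditer(text):
        token = match.group(0)
        if token[0].isupper() or len(token) < min_length:
            continue
        form = token.lower()
        if form in known_words or form in seen:
            continue
        seen.add(form)
        # Keep a character window around the word as its textual context.
        start = max(0, match.start() - window)
        end = min(len(text), match.end() + window)
        candidates.append((form, text[start:end]))
    return candidates


if __name__ == "__main__":
    article = "Le ministre a annoncé un plan contre l'infobésité."
    for word, context in candidate_new_words(article):
        print(word, "|", context)
```

In a semi-automatic workflow, the pairs returned by this filter would only be candidates: a human reviewer would still need to reject typos, foreign words, and missed proper nouns before a form is documented as a new word.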