DLAI | Sep 1, 2025

Animer une base de connaissance : des ontologies aux modèles d'I.A. générative (Animating a knowledge base: from ontologies to generative AI models)

arXiv:2509.01304v1 | h-index: 4
Originality: Synthesis-oriented
AI Analysis

This work addresses the challenge of enhancing knowledge bases in humanities and social sciences with AI, though it appears incremental in applying existing methods to a specific domain.

The paper tackles the integration of generative AI tools into the lifecycle of a knowledge base for area studies. It proposes a semiotic framework for combining symbolic and neural AI methods, illustrates the approach with the LaCAS ecosystem (around 160,000 resources), and outlines specialized agents for tasks such as data qualification and prompt engineering.

In a context where the social sciences and humanities are experimenting with non-anthropocentric analytical frames, this article proposes a semiotic (structural) reading of the hybridization between symbolic AI and neural (or sub-symbolic) AI based on a field of application: the design and use of a knowledge base for area studies. We describe the LaCAS ecosystem, Open Archives in Linguistic and Cultural Studies (thesaurus; RDF/OWL ontology; LOD services; harvesting; expertise; publication), deployed at Inalco (National Institute for Oriental Languages and Civilizations) in Paris with the Okapi (Open Knowledge and Annotation Interface) software environment from Ina (National Audiovisual Institute), which now has around 160,000 documentary resources and ten knowledge macro-domains grouping together several thousand knowledge objects. We illustrate this approach using the knowledge domain "Languages of the world" (~540 languages) and the knowledge object "Quechua (language)". On this basis, we discuss the controlled integration of neural tools, more specifically generative tools, into the life cycle of a knowledge base: assistance with data localization/qualification, index extraction and aggregation, property suggestion and testing, dynamic file generation, and engineering of contextualized prompts (generic, contextual, explanatory, adjustment, procedural) aligned with a domain ontology. We outline an ecosystem of specialized agents capable of animating the database while respecting its symbolic constraints, by articulating model-driven and data-driven methods.

