Making Metadata Fit for Next Generation Language Technology Platforms: The Metadata Schema of the European Language Grid
This work addresses the need for standardized metadata to facilitate the management and use of digital assets in the language technology domain. The contribution appears incremental, since it builds on existing schemas and guidelines.
The paper tackles the problem of managing and sharing language resources and technologies by presenting ELG-SHARE, a metadata schema designed for the European Language Grid platform, which aims to be the primary hub and marketplace for industry-relevant language technology in Europe.
The current scientific and technological landscape is characterised by the increasing availability of data resources and processing tools and services. In this setting, metadata have emerged as a key factor facilitating management, sharing and usage of such digital assets. In this paper we present ELG-SHARE, a rich metadata schema catering for the description of Language Resources and Technologies (processing and generation services and tools, models, corpora, term lists, etc.), as well as related entities (e.g., organizations, projects, supporting documents, etc.). The schema powers the European Language Grid platform that aims to be the primary hub and marketplace for industry-relevant Language Technology in Europe. ELG-SHARE has been based on various metadata schemas, vocabularies, and ontologies, as well as related recommendations and guidelines.
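To make the idea of a metadata record more concrete, the following is a minimal sketch of how a description of a language resource and its related entities might be modelled and serialized. It is not the ELG-SHARE schema: every class, field, and enum value here (`MetadataRecord`, `ResourceType`, `providers`, `funding_projects`, and so on) is a hypothetical name chosen for illustration, loosely following the resource categories and related entities listed in the abstract.

```python
import json
from dataclasses import asdict, dataclass, field
from enum import Enum
from typing import List


class ResourceType(Enum):
    # Categories mentioned in the abstract; the enum and its values are hypothetical.
    TOOL_OR_SERVICE = "tool_or_service"
    MODEL = "model"
    CORPUS = "corpus"
    TERM_LIST = "term_list"


@dataclass
class Organization:
    # Related entity: an organization that provides or maintains a resource.
    name: str


@dataclass
class Project:
    # Related entity: a project in whose context a resource was created.
    name: str


@dataclass
class MetadataRecord:
    # Illustrative record; the field names are assumptions, not ELG-SHARE elements.
    name: str
    resource_type: ResourceType
    description: str = ""
    providers: List[Organization] = field(default_factory=list)
    funding_projects: List[Project] = field(default_factory=list)
    documentation: List[str] = field(default_factory=list)

    def to_json(self) -> str:
        # asdict() converts nested dataclasses to dicts but leaves the enum
        # member as-is, so it is replaced by its string value before dumping.
        data = asdict(self)
        data["resource_type"] = self.resource_type.value
        return json.dumps(data, indent=2)


# Usage: describe a hypothetical tokenizer service and print it as JSON.
record = MetadataRecord(
    name="Example tokenizer",
    resource_type=ResourceType.TOOL_OR_SERVICE,
    description="Splits text into tokens.",
    providers=[Organization("Example Org")],
    funding_projects=[Project("Example Project")],
)
print(record.to_json())
```

Keeping related entities as separate types, rather than free-text fields, reflects the abstract's point that organizations, projects, and supporting documents are described alongside the resources themselves. A real schema would add identifiers, controlled vocabularies, and validation rules that this sketch omits.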