CL, Jun 13, 2024

CoastTerm: a Corpus for Multidisciplinary Term Extraction in Coastal Scientific Literature

arXiv:2406.09128v1
3 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for automated term analysis to support environmental policy-making in coastal areas, though it is incremental as it builds on existing frameworks and models.

The authors tackled the problem of extracting and classifying multidisciplinary terms from coastal scientific literature by introducing a specialized corpus of 2,491 sentences, achieving F1 scores of approximately 80% for term extraction and 70% for term and label extraction.

The growing impact of climate change on coastal areas, particularly active but fragile regions, necessitates collaboration among diverse stakeholders and disciplines to formulate effective environmental protection policies. We introduce a novel specialized corpus comprising 2,491 sentences from 410 scientific abstracts concerning coastal areas, for the Automatic Term Extraction (ATE) and Classification (ATC) tasks. Inspired by the ARDI framework, focused on the identification of Actors, Resources, Dynamics and Interactions, we automatically extract domain terms and their distinct roles in the functioning of coastal systems by leveraging monolingual and multilingual transformer models. The evaluation demonstrates consistent results, achieving an F1 score of approximately 80% for automated term extraction and F1 of 70% for extracting terms and their labels. These findings are promising and signify an initial step towards the development of a specialized Knowledge Base dedicated to coastal areas.
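The gap between the two reported scores (term extraction alone versus terms with their labels) can be illustrated with a minimal sketch. It assumes terms are annotated with BIO tags over tokens and scored by exact span match; the abstract does not state the tagging scheme or the matching criterion, so both are assumptions here, as are the helper names `bio_to_spans`, `extraction_f1` and `labeled_f1` and the label strings, which are only loosely modelled on the ARDI categories.

```python
from typing import List, Set, Tuple

# (start_token, end_token_exclusive, label); hypothetical representation
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect labeled spans from a BIO tag sequence (assumed scheme)."""
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        inside = tag.startswith("I-") and label == tag[2:]
        if start is not None and not inside:
            spans.add((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def f1(gold: Set, pred: Set) -> float:
    """Exact-match F1 between two sets of items."""
    tp = len(gold & pred)
    if tp == 0:
        return 0.0
    precision = tp / len(pred)
    recall = tp / len(gold)
    return 2 * precision * recall / (precision + recall)


def extraction_f1(gold_tags: List[str], pred_tags: List[str]) -> float:
    """Score term boundaries only, ignoring the predicted label."""
    gold = {(s, e) for s, e, _ in bio_to_spans(gold_tags)}
    pred = {(s, e) for s, e, _ in bio_to_spans(pred_tags)}
    return f1(gold, pred)


def labeled_f1(gold_tags: List[str], pred_tags: List[str]) -> float:
    """Score term boundaries and labels together."""
    return f1(bio_to_spans(gold_tags), bio_to_spans(pred_tags))


# Toy example: both terms are found, but one receives the wrong label.
gold = ["B-Actor", "I-Actor", "O", "B-Resource", "O"]
pred = ["B-Actor", "I-Actor", "O", "B-Dynamics", "O"]
print(extraction_f1(gold, pred))  # 1.0
print(labeled_f1(gold, pred))     # 0.5
```

In this toy case every term boundary is recovered, so the extraction score is perfect, while one mislabeled term halves the labeled score. The same mechanism would explain why a joint extraction-and-classification score sits below the extraction-only score.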
