Resolving Legalese: A Multilingual Exploration of Negation Scope Resolution in Legal Documents
This work addresses the challenge of accurately interpreting negations in legal texts for NLP applications. The contribution is incremental: it builds on existing methods and adds new domain-specific data.
The paper tackles negation scope resolution in multilingual legal documents by creating a new annotated dataset in German, French, and Italian. It reports token-level F1-scores of up to 86.7% in zero-shot cross-lingual experiments and up to 91.1% in multilingual experiments.
Resolving the scope of a negation within a sentence is a challenging NLP task. The complexity of legal texts and the lack of annotated in-domain negation corpora pose challenges for state-of-the-art (SotA) models when performing negation scope resolution on multilingual legal data. Our experiments demonstrate that models pre-trained without legal data underperform in the task of negation scope resolution. Language models fine-tuned exclusively on other domains, such as literary texts and medical data, yield results inferior to those documented in prior cross-domain experiments. We release a new set of annotated court decisions in German, French, and Italian and use it to improve negation scope resolution in both zero-shot and multilingual settings. We achieve token-level F1-scores of up to 86.7% in our zero-shot cross-lingual experiments, where the models are trained on two languages of our legal datasets and evaluated on the third. Our multilingual experiments, where the models are trained on all available negation data and evaluated on our legal datasets, yield F1-scores of up to 91.1%.
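To make the evaluation setup concrete, the following is a minimal sketch of a token-level F1 computation and of the "train on two languages, evaluate on the third" split described above. It assumes scope annotations are encoded as binary per-token labels (1 = token inside a negation scope) and that counts are pooled over all tokens; the paper's exact label scheme and averaging may differ. The function names `token_f1` and `leave_one_language_out` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def token_f1(gold: Sequence[Sequence[int]], pred: Sequence[Sequence[int]]) -> float:
    """Token-level F1 over binary scope labels, pooled across all sentences.

    Assumption: 1 marks a token inside a negation scope, 0 a token outside it.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must contain the same number of sentences")

    tp = fp = fn = 0
    for gold_sent, pred_sent in zip(gold, pred):
        if len(gold_sent) != len(pred_sent):
            raise ValueError("each sentence must have one prediction per gold token")
        for g, p in zip(gold_sent, pred_sent):
            if g == 1 and p == 1:
                tp += 1  # correctly predicted in-scope token
            elif g == 0 and p == 1:
                fp += 1  # predicted in scope, but gold says out of scope
            elif g == 1 and p == 0:
                fn += 1  # missed in-scope token

    if tp == 0:
        return 0.0  # avoids division by zero when nothing is correctly in scope
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def leave_one_language_out(languages: Sequence[str]) -> List[Tuple[List[str], str]]:
    """Return (training languages, held-out language) pairs for zero-shot evaluation."""
    return [
        ([lang for lang in languages if lang != held_out], held_out)
        for held_out in languages
    ]


if __name__ == "__main__":
    # Toy labels for two sentences; these are illustrative, not data from the paper.
    gold = [[0, 1, 1, 0], [1, 1, 0]]
    pred = [[0, 1, 0, 0], [1, 1, 1]]
    print(f"token-level F1: {token_f1(gold, pred):.2f}")

    for train_langs, test_lang in leave_one_language_out(["de", "fr", "it"]):
        print(f"train on {train_langs}, evaluate zero-shot on {test_lang}")
```

In the toy example, three tokens are correctly predicted in scope, one is a false positive, and one is missed, so precision and recall are both 0.75. Pooling counts over tokens rather than averaging per sentence means longer sentences weigh more; that is a design choice of this sketch, not a claim about the paper's metric.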