CL · May 24, 2023

Automated Refugee Case Analysis: An NLP Pipeline for Supporting Legal Practitioners

arXiv:2305.15533v1 · 22 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for efficient case retrieval and analysis for legal practitioners in refugee law, though it is incremental as it extends existing NER models to a new domain.

The paper tackles the problem of automating information extraction from Canadian refugee law cases by developing an NLP pipeline. Using domain-specific pre-training, it achieves F1 scores above 90% on five categories and above 80% on four others.

In this paper, we introduce an end-to-end pipeline for retrieving, processing, and extracting targeted information from legal cases. We investigate an under-studied legal domain with a case study on refugee law in Canada. Searching case law for past similar cases is a key part of legal work for both lawyers and judges, the potential end-users of our prototype. While traditional named-entity recognition labels such as dates provide meaningful information in legal work, we propose to extend existing models and retrieve a total of 19 useful categories of items from refugee cases. After creating a novel data set of cases, we perform information extraction based on state-of-the-art neural named-entity recognition (NER). We test different architectures including two transformer models, using contextual and non-contextual embeddings, and compare general-purpose versus domain-specific pre-training. The results demonstrate that models pre-trained on legal data perform best despite their smaller size, suggesting that domain matching had a larger effect than network architecture. We achieve an F1 score above 90% on five of the targeted categories and over 80% on four further categories.
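The abstract reports F1 per extracted category. As a reminder of what that metric measures, here is a minimal sketch of exact-match, per-category F1 over entity spans. It is not the authors' evaluation code: the function name `per_category_f1`, the span tuple layout, and the labels `DATE` and `CLAIMANT_INFO` are illustrative assumptions, and the paper may use a different matching criterion.

```python
def per_category_f1(gold, predicted):
    """Exact-match F1 per label.

    gold, predicted: sets of (doc_id, start, end, label) tuples.
    This tuple layout is an assumption made for illustration.
    """
    labels = {span[3] for span in gold | predicted}
    scores = {}
    for label in labels:
        g = {s for s in gold if s[3] == label}
        p = {s for s in predicted if s[3] == label}
        tp = len(g & p)  # spans that match exactly, including the label
        precision = tp / len(p) if p else 0.0
        recall = tp / len(g) if g else 0.0
        total = precision + recall
        scores[label] = 2 * precision * recall / total if total else 0.0
    return scores


# Hypothetical toy annotations; the labels are placeholders, not the paper's 19 categories.
gold = {
    ("case1", 0, 2, "DATE"),
    ("case1", 10, 12, "DATE"),
    ("case1", 20, 23, "CLAIMANT_INFO"),
}
predicted = {
    ("case1", 0, 2, "DATE"),
    ("case1", 20, 23, "CLAIMANT_INFO"),
    ("case1", 30, 31, "CLAIMANT_INFO"),
}

scores = per_category_f1(gold, predicted)
for label in sorted(scores):
    print(f"{label}: F1 = {scores[label]:.2f}")
```

In this toy example one category loses recall (a missed `DATE` span) and the other loses precision (a spurious `CLAIMANT_INFO` span), which is why scores are reported separately per category rather than as a single average.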

Code Implementations: 1 repo
