CL · Sep 29, 2023

Few-Shot Domain Adaptation for Charge Prediction on Unprofessional Descriptions

arXiv:2309.17313v1 · 4 citations · h-index: 82
Originality: Highly original
AI Analysis

This work addresses the domain discrepancy in legal charge prediction for layperson users, representing an incremental improvement with a novel method for a known bottleneck.

The paper tackles the problem of charge prediction on unprofessional legal descriptions by proposing a few-shot domain adaptation method that disentangles content and style representations, achieving superior performance on a new non-PLLS dataset.

Recent works on texts written in professional legal-linguistic style (PLLS) have shown promising results on the charge prediction task. However, unprofessional users also show an increasing demand for such a prediction service. There is a clear domain discrepancy between PLLS texts and the non-PLLS texts written by laypersons, which degrades the performance of current SOTA models on non-PLLS texts. A key challenge is the scarcity of non-PLLS data for most charge classes. This paper proposes a novel few-shot domain adaptation (FSDA) method named Disentangled Legal Content for Charge Prediction (DLCCP). Existing FSDA works perform only instance-level alignment and do not consider the negative impact of the text style information present in latent features. In contrast, DLCCP (1) disentangles the content and style representations, using carefully designed optimization goals for the content and style spaces, to better learn domain-invariant legal content, and (2) employs knowledge of the constitutive elements of charges to extract and align element-level and instance-level content representations simultaneously. We contribute the first publicly available non-PLLS dataset, named NCCP, for developing layperson-friendly charge prediction models. Experiments on NCCP show the superiority of our method over competitive baselines.
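The idea of aligning only the content part of a representation across domains can be illustrated with a minimal sketch. This is not the DLCCP implementation: the abstract gives no architecture or loss definitions, so everything below is an assumption made for illustration. In particular, `split_content_style`, `class_prototypes`, `content_alignment_loss`, and the `content_dim` parameter are hypothetical names, the fixed slice stands in for a disentanglement that the paper learns through optimization goals, and the squared-distance loss is a generic instance-level alignment term rather than the authors' objective.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Vector = List[float]


def split_content_style(feature: Sequence[float], content_dim: int) -> Tuple[Vector, Vector]:
    """Hypothetical stand-in for disentanglement: treat the first
    content_dim entries as content and the remainder as style."""
    return list(feature[:content_dim]), list(feature[content_dim:])


def class_prototypes(
    features: Sequence[Sequence[float]],
    labels: Sequence[str],
    content_dim: int,
) -> Dict[str, Vector]:
    """Average the content part of the source-domain (PLLS) features per charge."""
    sums: Dict[str, Vector] = {}
    counts: Dict[str, int] = defaultdict(int)
    for feature, label in zip(features, labels):
        content, _style = split_content_style(feature, content_dim)
        if label not in sums:
            sums[label] = [0.0] * content_dim
        sums[label] = [s + c for s, c in zip(sums[label], content)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def content_alignment_loss(
    source_prototypes: Dict[str, Vector],
    target_features: Sequence[Sequence[float]],
    target_labels: Sequence[str],
    content_dim: int,
) -> float:
    """Mean squared distance between each few-shot target (non-PLLS) content
    vector and the source prototype of the same charge. Style entries are
    ignored, so style differences alone do not increase the loss."""
    total = 0.0
    matched = 0
    for feature, label in zip(target_features, target_labels):
        if label not in source_prototypes:
            continue  # no source examples for this charge
        content, _style = split_content_style(feature, content_dim)
        prototype = source_prototypes[label]
        total += sum((c - p) ** 2 for c, p in zip(content, prototype))
        matched += 1
    return total / matched if matched else 0.0


# Toy features: two content dimensions followed by one style dimension.
source = [[1.0, 0.0, 9.0], [3.0, 0.0, -9.0]]
prototypes = class_prototypes(source, ["theft", "theft"], content_dim=2)
same_content = content_alignment_loss(prototypes, [[2.0, 0.0, 100.0]], ["theft"], content_dim=2)
shifted_content = content_alignment_loss(prototypes, [[2.0, 1.0, 5.0]], ["theft"], content_dim=2)
```

In this toy setup a target example whose content matches the source prototype incurs no loss even when its style value is very different, while a shift in the content dimensions is penalized. That is the intuition behind separating style from content before alignment; the learned separation and the element-level alignment described in the abstract are beyond what this sketch covers.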

