IR · Dec 16, 2021

An Empirical Study on Transfer Learning for Privilege Review

arXiv:2112.08606v1 · 5 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses the cost and scalability issues of legal privilege review for practitioners, but it is incremental: it applies existing methods to a specific domain.

The paper tackles the problem of identifying privileged legal documents by studying transfer learning with BERT and traditional models on three real-world datasets. It shows that BERT outperforms logistic regression and that transfer learning achieves decent performance on datasets in the same or similar domains.

Protecting privileged communications and data from inadvertent disclosure is a paramount task in US legal practice. Traditionally, counsel rely on keyword searching and manual review to identify privileged documents in cases. As data volumes increase, the cost of this approach becomes less and less defensible. Machine learning methods have been used to identify privileged documents. Given the generalizable nature of privilege in legal cases, we hypothesize that transfer learning can capitalize on knowledge learned from existing labeled data to identify privileged documents without requiring new labeled training data. In this paper, we study both traditional machine learning models and deep learning models based on BERT for privileged document classification tasks in legal document review, and we examine the effectiveness of transfer learning for privilege models on three real-world datasets with privilege labels. Our results show that the BERT model outperforms the industry-standard logistic regression algorithm and that transfer learning models can achieve decent performance on datasets in the same or closely related domains.
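The transfer setup described in the abstract (train on labeled documents from one matter, then score documents from another matter without new labels) can be illustrated with a minimal sketch. This is not the paper's implementation: it uses a small bag-of-words logistic regression written in plain Python as a stand-in for the logistic regression baseline, it does not cover BERT, and every document, label, and function name below is a hypothetical toy example made up for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical, deliberately naive tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def train_logreg(docs, labels, epochs=200, lr=0.5):
    """Fit a bag-of-words logistic regression with stochastic gradient descent."""
    weights = Counter()  # unseen tokens read as weight 0
    bias = 0.0
    for _ in range(epochs):
        for doc, y in zip(docs, labels):
            feats = Counter(tokenize(doc))
            z = bias + sum(weights[t] * c for t, c in feats.items())
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
            p = 1.0 / (1.0 + math.exp(-z))
            err = y - p  # gradient of the log-likelihood w.r.t. z
            bias += lr * err
            for t, c in feats.items():
                weights[t] += lr * err * c
    return weights, bias


def predict(model, doc):
    """Return 1 (privileged) or 0 (not privileged) for a single document."""
    weights, bias = model
    feats = Counter(tokenize(doc))
    z = bias + sum(weights.get(t, 0.0) * c for t, c in feats.items())
    return 1 if z > 0 else 0


# Toy "source" matter with privilege labels (assumed data, not from the paper).
source_docs = [
    "attorney client advice regarding the merger contract",
    "legal counsel opinion on pending litigation",
    "lunch menu for the team offsite",
    "quarterly sales figures attached for review",
]
source_labels = [1, 1, 0, 0]

# Toy "target" matter: labels are used only for evaluation, never for training.
target_docs = [
    "counsel advice on the patent litigation",
    "sales team offsite schedule",
]
target_labels = [1, 0]

model = train_logreg(source_docs, source_labels)
target_preds = [predict(model, d) for d in target_docs]
accuracy = sum(p == y for p, y in zip(target_preds, target_labels)) / len(target_labels)
print("target predictions:", target_preds, "accuracy:", accuracy)
```

The sketch transfers only because the toy target documents share vocabulary with the source documents, which loosely mirrors the abstract's observation that transfer works on datasets in the same or close domains. A target matter with unrelated vocabulary would leave most tokens at weight zero.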

