CL AI LG

Apr 17, 2023

Classification of US Supreme Court Cases using BERT-Based Techniques

Shubham Vatsal, Adam Meyers, John E. Ortega

arXiv:2304.08649v321.3134 citations

h-index: 24

Has Code

Originality: Incremental advance

AI Analysis

This work addresses the challenge of applying BERT models to long legal documents, providing incremental improvements in classification accuracy for legal NLP tasks.

The paper tackles the classification of long US Supreme Court documents with BERT-based techniques, reaching 80% accuracy on a 15-category task and 60% on a 279-category task, improvements of 8% and 28%, respectively, over previously reported state-of-the-art results.

Models based on bidirectional encoder representations from transformers (BERT) produce state-of-the-art (SOTA) results on many natural language processing (NLP) tasks, such as named entity recognition (NER) and part-of-speech (POS) tagging. An interesting phenomenon occurs when classifying long documents, such as those from the US Supreme Court, where BERT-based models can be difficult to use on a first-pass or out-of-the-box basis. In this paper, we experiment with several BERT-based classification techniques for US Supreme Court decisions from the Supreme Court Database (SCDB) and compare them with the previous SOTA results. We then compare our results specifically with SOTA models for long documents. We compare our results on two classification tasks: (1) a broad classification task with 15 categories and (2) a fine-grained classification task with 279 categories. Our best result is an accuracy of 80% on the 15 broad categories and 60% on the 279 fine-grained categories, an improvement of 8% and 28%, respectively, over previously reported SOTA results.
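To illustrate why long documents are awkward for BERT out of the box: a standard BERT encoder accepts at most 512 positions, two of which are normally taken by the special [CLS] and [SEP] tokens. One common workaround, shown here only as a minimal sketch and not necessarily the technique used in the paper, is to split the token sequence into overlapping windows, classify each window, and average the per-window class scores. The function names `split_into_windows` and `average_scores`, the window size of 510, and the stride of 255 are all assumptions chosen for illustration.

```python
from typing import List


def split_into_windows(
    token_ids: List[int], max_len: int = 510, stride: int = 255
) -> List[List[int]]:
    """Split a long token sequence into overlapping windows.

    max_len=510 is an assumed budget that leaves room for the two
    special tokens in a 512-position BERT input. stride is the
    (hypothetical) step between window starts, so consecutive
    windows overlap by max_len - stride tokens.
    """
    if max_len <= 0 or stride <= 0:
        raise ValueError("max_len and stride must be positive")
    if not token_ids:
        return []

    windows = []
    start = 0
    while True:
        windows.append(token_ids[start:start + max_len])
        # Stop once the current window reaches the end of the document.
        if start + max_len >= len(token_ids):
            break
        start += stride
    return windows


def average_scores(window_scores: List[List[float]]) -> List[float]:
    """Average per-window class scores into one document-level score vector."""
    if not window_scores:
        raise ValueError("need at least one window")
    n = len(window_scores)
    return [sum(column) / n for column in zip(*window_scores)]
```

In use, each window would be passed to a classifier separately, and the index of the largest averaged score would be taken as the document label. Averaging is only one possible aggregation; taking the maximum score per class or classifying only the first window are equally plausible baselines.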
