CL AI NE

Feb 13, 2022

Transformer-based Approaches for Legal Text Processing

Ha-Thanh Nguyen, Minh-Phuong Nguyen, Thi-Hai-Yen Vuong, Minh-Quan Bui, Minh-Chau Nguyen, Tran-Binh Dang, Vu Tran, Le-Minh Nguyen, Ken Satoh

arXiv:2202.06397v10.616 citations

h-index: 30

Originality: Synthesis-oriented

AI Analysis

This work addresses the challenge of processing legal documents with limited data, though it is incremental as it applies existing Transformer methods to a specific domain.

The paper tackles automated legal text processing with Transformer-based models for the COLIEE 2021 competition, and its NFSP model achieves the state-of-the-art result in Task 5.

In this paper, we introduce our approaches using Transformer-based models for different problems of the COLIEE 2021 automatic legal text processing competition. Automated processing of legal documents is a challenging task because of the characteristics of legal documents and the limited amount of available data. Through detailed experiments, we found that Transformer-based pretrained language models can perform well on automated legal text processing problems when paired with appropriate approaches. We describe in detail the processing steps for each task, including problem formulation, data processing and augmentation, pretraining, and finetuning. In addition, we introduce to the community two pretrained models that take advantage of parallel translations in the legal domain, NFSP and NMSP. Of these, NFSP achieves the state-of-the-art result in Task 5 of the competition. Although the paper focuses on technical reporting, the novelty of its approaches can also be a useful reference for automated legal document processing using Transformer-based models.
