No Argument Left Behind: Overlapping Chunks for Faster Processing of Arbitrarily Long Legal Texts
The work addresses a critical bottleneck in the world's largest judiciary system, though it is an incremental improvement over existing methods.
The paper tackles the slow processing of long legal texts in the Brazilian judiciary by introducing uBERT, a hybrid Transformer-RNN model that efficiently processes full texts of any length. uBERT outperforms BERT+LSTM when overlapping input is used and is significantly faster than ULMFiT.
The Brazilian judiciary system, the largest in the world, faces a crisis due to the slow processing of millions of cases, which makes efficient methods for analyzing legal texts imperative. We introduce uBERT, a hybrid model that combines Transformer and Recurrent Neural Network architectures to effectively handle long legal texts. Our approach processes the full text regardless of its length while maintaining reasonable computational overhead. Our experiments demonstrate that uBERT outperforms BERT+LSTM when overlapping input is used and is significantly faster than ULMFiT for processing long legal documents.
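To make the idea of overlapping input concrete, the following is a minimal sketch of how a long token sequence could be split into fixed-size chunks that share tokens with their neighbours, so that no part of the text is dropped. It is an illustration only, not the authors' implementation: the function name `overlapping_chunks` and the default `chunk_size` and `overlap` values are assumptions, since the abstract does not state them. In a hybrid model of the kind described, each chunk would presumably be encoded by the Transformer and the resulting sequence of chunk representations passed to the recurrent layer.

```python
from typing import List, Sequence


def overlapping_chunks(
    tokens: Sequence[int],
    chunk_size: int = 512,  # assumed value, not taken from the paper
    overlap: int = 128,     # assumed value, not taken from the paper
) -> List[List[int]]:
    """Split `tokens` into chunks of at most `chunk_size` tokens.

    Consecutive chunks share `overlap` tokens, and every token of the
    input appears in at least one chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    if len(tokens) == 0:
        return []

    stride = chunk_size - overlap  # how far the window advances each step
    chunks: List[List[int]] = []
    start = 0
    while True:
        chunks.append(list(tokens[start:start + chunk_size]))
        # Stop once the current window reaches the end of the text.
        if start + chunk_size >= len(tokens):
            break
        start += stride
    return chunks
```

For example, calling `overlapping_chunks(list(range(10)), chunk_size=4, overlap=2)` yields four chunks, each sharing its first two tokens with the end of the previous one. The final chunk may be shorter than `chunk_size` when the text length does not align with the stride.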