CL AI Sep 8, 2023

NESTLE: a No-Code Tool for Statistical Analysis of Legal Corpus

Kyoungyeon Cho, Seungkum Han, Young Rok Choi, Wonseok Hwang

arXiv:2309.04146v218.8103 citations h-index: 2 Has Code

Originality: Incremental advance

AI Analysis

NESTLE gives legal professionals and researchers a practical way to analyze large legal datasets without coding skills. The advance is incremental, since it builds on existing LLM and IE technologies.

The authors address the problem of running statistical analysis on legal corpora without programming. They introduce NESTLE, a no-code tool that uses an LLM and a custom IE system to extract user-defined information. With minimal labeled data (4 human-labeled and 192 LLM-labeled examples), it achieves performance comparable to GPT-4.

Statistical analysis of a large-scale legal corpus can provide valuable legal insights. For such analysis one needs to (1) select a subset of the corpus using document retrieval tools, (2) structure the text using information extraction (IE) systems, and (3) visualize the data for statistical analysis. Each step demands either specialized tools or programming skills, and no comprehensive, unified "no-code" tool has been available. Here we provide NESTLE, a no-code tool for large-scale statistical analysis of legal corpora. Powered by a Large Language Model (LLM) and an internal custom end-to-end IE system, NESTLE can extract any type of information, including types that have not been predefined in the IE system, opening up the possibility of unlimited customizable statistical analysis of the corpus without writing a single line of code. We validate our system on 15 Korean precedent IE tasks and 3 legal text classification tasks from LexGLUE. Comprehensive experiments reveal that NESTLE can achieve performance comparable to GPT-4 by training the internal IE module with 4 human-labeled and 192 LLM-labeled examples.
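The three steps named in the abstract (select, structure, analyze) can be illustrated with a toy Python sketch. This is not NESTLE's implementation: the corpus, the function names `select`, `extract`, and `summarize`, and the regex patterns are all hypothetical stand-ins, with a keyword match in place of document retrieval and regular expressions in place of the LLM-backed IE module.

```python
import re
from collections import Counter

# Hypothetical toy corpus, invented for illustration (not from the paper).
CORPUS = [
    "Case 1: The defendant was convicted of fraud and fined 500 dollars.",
    "Case 2: The defendant was convicted of fraud and fined 1200 dollars.",
    "Case 3: The defendant was convicted of theft and fined 300 dollars.",
    "Case 4: The plaintiff's claim for damages was dismissed.",
]


def select(corpus, keyword):
    """Step 1: keyword match as a stand-in for a document retrieval tool."""
    return [doc for doc in corpus if keyword.lower() in doc.lower()]


def extract(doc):
    """Step 2: regexes as a stand-in for an information extraction system."""
    crime = re.search(r"convicted of (\w+)", doc)
    fine = re.search(r"fined (\d+) dollars", doc)
    return {
        "crime": crime.group(1) if crime else None,
        "fine": int(fine.group(1)) if fine else None,
    }


def summarize(records):
    """Step 3: per-crime counts and mean fines, ready to plot or tabulate."""
    counts = Counter()
    totals = Counter()
    for rec in records:
        # Skip records where extraction failed for either field.
        if rec["crime"] is None or rec["fine"] is None:
            continue
        counts[rec["crime"]] += 1
        totals[rec["crime"]] += rec["fine"]
    return {
        crime: {"count": n, "mean_fine": totals[crime] / n}
        for crime, n in counts.items()
    }


selected = select(CORPUS, "convicted")
records = [extract(doc) for doc in selected]
stats = summarize(records)
print(stats)
```

In this sketch each stage only consumes the previous stage's output, so the regex extractor could be swapped for a learned model without touching the selection or summary code.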
