CY · CL · LG · Jan 4

AppellateGen: A Benchmark for Appellate Legal Judgment Generation

Hongkun Yang, Lionel Z. Wang, Wei Fan, Yiran Hu, Lixu Wang, Chenyu Liu, Shenghong Fu, Haoyang Li, Xin Xu, Jiexin Zheng, Wei Dong

arXiv:2601.01331v1 · 2.31 citations

Originality: Incremental advance

AI Analysis

This work addresses a gap in legal intelligence for appellate review, but it is incremental as it builds on existing legal judgment generation research by extending it to a more complex domain.

The authors address legal judgment generation for appellate cases, which prior research focused on first-instance trials had neglected. They introduce AppellateGen, a benchmark of 7,351 case pairs, and find that although their proposed SLMAS method improves logical consistency, appellate reasoning remains a substantial challenge for current LLMs.

Legal judgment generation is a critical task in legal intelligence. However, existing research in legal judgment generation has predominantly focused on first-instance trials, relying on static fact-to-verdict mappings while neglecting the dialectical nature of appellate (second-instance) review. To address this, we introduce AppellateGen, a benchmark for second-instance legal judgment generation comprising 7,351 case pairs. The task requires models to draft legally binding judgments by reasoning over the initial verdict and evidentiary updates, thereby modeling the causal dependency between trial stages. We further propose a judicial Standard Operating Procedure (SOP)-based Legal Multi-Agent System (SLMAS) to simulate judicial workflows, which decomposes the generation process into discrete stages of issue identification, retrieval, and drafting. Experimental results indicate that while SLMAS improves logical consistency, the complexity of appellate reasoning remains a substantial challenge for current LLMs. The dataset and code are publicly available at: https://anonymous.4open.science/r/AppellateGen-5763.
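The staged decomposition described in the abstract (issue identification, retrieval, drafting) can be pictured with a minimal Python sketch. This is not the authors' SLMAS implementation: the `AppealCase` record, the `TOY_STATUTES` index, and every function name here are hypothetical, and each stage is a plain function standing in for what would presumably be an LLM agent or a legal retrieval system.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AppealCase:
    """Hypothetical input: a first-instance verdict plus appellate updates."""
    first_instance_verdict: str
    appeal_claims: List[str]
    new_evidence: List[str] = field(default_factory=list)


# Toy stand-in for a statute index (assumption); a real system would
# query a legal corpus instead of a two-entry dictionary.
TOY_STATUTES: Dict[str, str] = {
    "contract": "Placeholder provision on contract liability.",
    "damages": "Placeholder provision on calculating damages.",
}


def identify_issues(case: AppealCase) -> List[str]:
    """Stage 1: treat each non-empty appeal claim as a disputed issue."""
    return [c.strip().lower() for c in case.appeal_claims if c.strip()]


def retrieve(issues: List[str]) -> Dict[str, List[str]]:
    """Stage 2: naive keyword lookup of provisions for each issue."""
    return {
        issue: [text for key, text in TOY_STATUTES.items() if key in issue]
        for issue in issues
    }


def draft(case: AppealCase, retrieved: Dict[str, List[str]]) -> str:
    """Stage 3: assemble a draft from the prior verdict, evidence, and provisions."""
    lines = [f"First-instance verdict under review: {case.first_instance_verdict}"]
    for item in case.new_evidence:
        lines.append(f"New evidence considered: {item}")
    for issue, provisions in retrieved.items():
        basis = "; ".join(provisions) if provisions else "no provision retrieved"
        lines.append(f"Issue: {issue} | Basis: {basis}")
    return "\n".join(lines)


def run_pipeline(case: AppealCase) -> str:
    """Run the three stages in order, each consuming the previous output."""
    issues = identify_issues(case)
    retrieved = retrieve(issues)
    return draft(case, retrieved)
```

Calling `run_pipeline(AppealCase("Defendant liable for breach", ["Contract was void"], ["Bank transfer record"]))` returns a three-line draft. The point of the sketch is only the data flow: the drafting stage sees the initial verdict and the new evidence together with the retrieved provisions, which mirrors the dependency between trial stages that the benchmark is described as modeling.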
