JudgeMeNot: Personalizing Large Language Models to Emulate Judicial Reasoning in Hebrew
For legal AI applications, this work offers a method for personalizing LLMs to the reasoning styles of individual judges. It is a step toward practical deployment, but the contribution is incremental because it relies on existing techniques.
The paper tackles the problem of personalizing large language models for individual decision-makers in low-resource settings. It uses a synthetic-organic supervision pipeline to turn raw judicial decisions into instruction-tuning data, then fine-tunes a model for each judge. The approach significantly outperforms baselines on lexical, stylistic, and semantic similarity, and its outputs are reported to be indistinguishable from the reasoning of human judges.
Despite significant advances in large language models, personalizing them for individual decision-makers remains an open problem. Here, we introduce a synthetic-organic supervision pipeline that transforms raw judicial decisions into instruction-tuning data, enabling parameter-efficient fine-tuning of personalized models for individual judges in low-resource settings. We compare our approach to state-of-the-art personalization techniques across three different tasks and settings. The results show that Causal Language Modeling followed by synthetically generated instruction-tuning outperforms all baselines, with significant improvements across lexical, stylistic, and semantic similarity. Notably, our model-generated outputs are indistinguishable from the reasoning of human judges, highlighting the viability of efficient personalization, even in low-resource settings.
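The abstract does not spell out how raw decisions become training data, so the following Python sketch is only one plausible reading, not the authors' implementation. It assumes that "synthetic-organic" means pairing a synthetically generated instruction with the judge's own (organic) text as the target, and it prepares data for the two stages named above: plain text for Causal Language Modeling, then instruction records. The `Decision` schema, the function names, and the template-based `template_instruction` are all hypothetical; a real pipeline would likely generate instructions with an LLM.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    """A raw judicial decision (hypothetical schema, not taken from the paper)."""
    judge: str
    facts: str
    reasoning: str


def build_clm_corpus(decisions, judge):
    """Stage 1 data: plain text of one judge's decisions for causal LM training."""
    return [f"{d.facts}\n{d.reasoning}" for d in decisions if d.judge == judge]


def build_instruction_records(decisions, judge, make_instruction):
    """Stage 2 data: instruction-tuning records for one judge.

    make_instruction stands in for the synthetic step (assumed here to be
    a generator that writes an instruction for each organic decision).
    """
    records = []
    for d in decisions:
        if d.judge != judge:
            continue
        records.append({
            "instruction": make_instruction(d),  # synthetic part (assumption)
            "input": d.facts,
            "output": d.reasoning,  # organic part: text written by the judge
        })
    return records


def template_instruction(decision):
    # Placeholder for a synthetic generator; purely illustrative.
    return f"Write the reasoning for this case in the style of Judge {decision.judge}."


# Toy data, invented for illustration only.
decisions = [
    Decision("A", "Tenant withheld rent.", "The lease terms are unambiguous."),
    Decision("B", "Driver ran a red light.", "Negligence is established."),
    Decision("A", "Contract lacked a signature.", "Formality serves a purpose."),
]

clm_corpus = build_clm_corpus(decisions, "A")
records = build_instruction_records(decisions, "A", template_instruction)
```

Keeping the two stages as separate datasets mirrors the ordering described in the abstract: the model would first see the judge's raw text, then the instruction records. The parameter-efficient fine-tuning step itself is left out because the abstract does not name a specific method or library.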