LG · Mar 8

Deterministic Fuzzy Triage for Legal Compliance Classification and Evidence Retrieval

arXiv:2603.07390v1

Predicted impact: top 98% in LG (last 90 days) · Originality: Incremental advance

AI Analysis

This work provides a more transparent and reproducible machine learning approach for legal teams to triage contractual evidence, addressing concerns about model opacity and non-determinism in sensitive legal contexts.

This paper proposes a deterministic dual-encoder system with fuzzy triage bands for legal compliance classification and evidence retrieval. It reports NDCG@5 of 0.38-0.42 and NDCG@10 of 0.45-0.50 on ACORD-style retrieval, and AUC of 0.98-0.99 with F1 of 0.22-0.30 on CUAD-derived binary compliance labels, outperforming majority and random baselines in a highly imbalanced setting (positive rate of about 0.6%).
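For readers unfamiliar with the retrieval metric quoted above, the following is a minimal sketch of NDCG@k over graded relevance labels (for example, star ratings). The function names are hypothetical, and the exponential gain `2**rel - 1` is an assumption; the paper may use a linear gain or a different ideal-ranking convention.

```python
import math
from typing import Sequence


def dcg_at_k(rels: Sequence[int], k: int) -> float:
    """Discounted cumulative gain of the first k items (exponential gain, assumed)."""
    return sum((2 ** r - 1) / math.log2(i + 2) for i, r in enumerate(rels[:k]))


def ndcg_at_k(ranked_rels: Sequence[int], k: int) -> float:
    """NDCG@k for one query.

    ranked_rels holds the graded relevance of every judged candidate,
    listed in the order the model ranked them.
    """
    ideal = dcg_at_k(sorted(ranked_rels, reverse=True), k)
    if ideal == 0:
        return 0.0  # no relevant candidates for this query
    return dcg_at_k(ranked_rels, k) / ideal


if __name__ == "__main__":
    # A relevant clause ranked second instead of first lowers the score below 1.
    print(ndcg_at_k([0, 3], k=2))
```

A dataset-level figure such as the reported NDCG@5 would typically be the mean of this per-query value over all test queries.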

Legal teams increasingly use machine learning to triage large volumes of contractual evidence, but many models are opaque, non-deterministic, and difficult to align with frameworks such as HIPAA or NERC-CIP. We study a simple, reproducible alternative based on deterministic dual encoders and transparent fuzzy triage bands. We train a RoBERTa-base dual encoder with a 512-dimensional projection and cosine similarity on the ACORD benchmark for graded clause retrieval, then fine-tune it on a CUAD-derived binary compliance dataset. Across five random seeds (40-44) on a single NVIDIA A100 GPU, the model achieves ACORD-style retrieval performance of NDCG@5 0.38-0.42, NDCG@10 0.45-0.50, and 4-star Precision@5 about 0.37 on the test split. On CUAD-derived binary labels, it achieves AUC 0.98-0.99 and F1 0.22-0.30 depending on positive-class weighting, outperforming majority and random baselines in a highly imbalanced setting with a positive rate of about 0.6%. We then map scalar compliance scores into three regions: auto-noncompliant, auto-compliant, and human-review. Thresholds are tuned on validation data to maximize automatic decision coverage subject to an empirical error-rate constraint of at most 2% over auto-decided examples. The result is a seed-stable system summarized by a small number of scalar parameters. We argue that deterministic encoders, calibrated fuzzy bands, and explicit error constraints provide a practical middle ground between hand-crafted rules and opaque large language models, supporting explainable evidence triage, reproducible audit trails, and concrete mappings to legal review concepts.
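The three-band triage described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the convention that higher scores mean "more compliant" and that label 1 means compliant, and the brute-force grid search over observed scores. The abstract states only the objective (maximize automatic coverage subject to an error rate of at most 2% on auto-decided examples), not the search procedure.

```python
from typing import List, Optional, Tuple


def triage(score: float, t_low: float, t_high: float) -> str:
    """Map a scalar compliance score to one of three bands."""
    if score <= t_low:
        return "auto-noncompliant"
    if score >= t_high:
        return "auto-compliant"
    return "human-review"


def tune_thresholds(
    scores: List[float], labels: List[int], max_error: float = 0.02
) -> Optional[Tuple[float, float, float]]:
    """Return (coverage, t_low, t_high) maximizing auto-decision coverage.

    Assumes label 1 = compliant and 0 = noncompliant (hypothetical convention).
    Candidate thresholds are the observed validation scores; the search is
    quadratic in the number of distinct scores, which is fine for a sketch.
    """
    candidates = sorted(set(scores))
    n = len(scores)
    best: Optional[Tuple[float, float, float]] = None
    for i, t_low in enumerate(candidates):
        for t_high in candidates[i + 1:]:
            decided = 0
            errors = 0
            for s, y in zip(scores, labels):
                if s <= t_low:        # auto-noncompliant; wrong if truly compliant
                    decided += 1
                    errors += int(y == 1)
                elif s >= t_high:     # auto-compliant; wrong if truly noncompliant
                    decided += 1
                    errors += int(y == 0)
            if decided and errors / decided <= max_error:
                coverage = decided / n
                if best is None or coverage > best[0]:
                    best = (coverage, t_low, t_high)
    return best


if __name__ == "__main__":
    val_scores = [0.05, 0.1, 0.2, 0.45, 0.55, 0.8, 0.9, 0.95]
    val_labels = [0, 0, 0, 1, 0, 1, 1, 1]
    print(tune_thresholds(val_scores, val_labels, max_error=0.0))
    print(triage(0.5, 0.2, 0.8))
```

Under these assumptions, the tuned system is fully described by the two thresholds and the error budget, which matches the abstract's point that the result is summarized by a small number of scalar parameters.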
