CL · May 16

RTI-Bench: A Structured Dataset for Indian Right-to-Information Decision Analysis

arXiv:2605.16843 · 76.1 · Has Code

Predicted impact top 80% in CL · last 90 days

Originality Synthesis-oriented

AI Analysis

For researchers and practitioners in legal NLP and Indian administrative law, this dataset enables automated analysis and prediction of RTI appeal outcomes, addressing a practical access-to-justice problem.

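The paper introduces RTI-Bench, which the authors describe as, to their knowledge, the first publicly released structured dataset for Indian Right-to-Information (RTI) administrative decisions. It comprises 1,516 cases (1,218 from an instruction-response corpus and 298 CIC decision PDFs) annotated with outcome labels, exemption citations, and IRAC-style reasoning components. A zero-shot Mistral 7B baseline, evaluated on 100 cases, achieves 57.3% accuracy and 37.0% macro-F1 on outcome prediction, well above the majority-class baseline of 14.3% macro-F1.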

India's Right to Information Act, 2005 gives every citizen the right to demand information from public authorities, yet in practice most people cannot make sense of the dense administrative language used in Central Information Commission (CIC) decisions, let alone predict whether an appeal is worth filing. This paper introduces RTI-Bench, a structured dataset of CIC decisions with outcome labels, exemption citations, IRAC-style reasoning components, and procedural timelines. To the best of our knowledge it is the first publicly released structured dataset for Indian RTI administrative decisions. The dataset draws from two sources: 1,218 cases from a publicly available instruction-response corpus (with structured fields added through rule-based extraction), and 298 CIC decision PDFs collected directly from the Commission portal, spanning five commissioners and three document format generations from 2023 to 2026. Label coverage reaches 89% on the instruction-response corpus. For the PDF subset of 239 primary decisions, coverage is 51% in this first release. A random sample of 50 labelled cases was manually reviewed, yielding a label precision of 95.3%. A zero-shot Mistral 7B baseline on 100 cases gives 57.3% accuracy and 37.0% macro-F1 on outcome prediction, well above the majority-class baseline of 14.3% macro-F1. RTI-Bench is available at https://huggingface.co/datasets/joyboseroy/rti-bench
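The gap between accuracy and macro-F1 reported above is easier to read with a small worked example. The sketch below is not the authors' evaluation code: the outcome label names and the ten toy cases are hypothetical, chosen only to show why a majority-class predictor can reach respectable accuracy while its macro-F1 stays low. Macro-F1 averages per-class F1 with equal weight, so every class the baseline never predicts contributes zero.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of cases where the predicted outcome matches the true one."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A class with no true or predicted instances scores 0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def majority_baseline(y_true):
    """Predict the most frequent true label for every case."""
    majority_label = Counter(y_true).most_common(1)[0][0]
    return [majority_label] * len(y_true)


# Hypothetical outcome labels and counts, for illustration only;
# these are NOT the label set or distribution used in RTI-Bench.
y_true = (
    ["dismissed"] * 6
    + ["allowed"] * 2
    + ["partly_allowed"] * 1
    + ["remanded"] * 1
)
y_base = majority_baseline(y_true)

print(f"baseline accuracy: {accuracy(y_true, y_base):.3f}")
print(f"baseline macro-F1: {macro_f1(y_true, y_base):.4f}")
```

On this toy data the baseline is right on 6 of 10 cases, but only the "dismissed" class has a non-zero F1 (0.75), and averaging over four classes gives a macro-F1 of 0.1875. The same effect would explain how a majority-class baseline can score a low macro-F1 on an imbalanced, multi-class outcome task.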
