CL | AI | May 5

PatRe: A Full-Stage Office Action and Rebuttal Generation Benchmark for Patent Examination

arXiv:2605.03571 | 78.0 | Has Code
AI Analysis

For researchers and practitioners in patent examination and legal AI, this benchmark offers a more realistic evaluation of LLMs on interactive legal reasoning. The contribution is incremental, however, because it applies existing LLM evaluation methods to a new domain.

PatRe introduces the first benchmark modeling the full patent examination lifecycle, including Office Action generation and applicant rebuttal, across 480 real-world cases. Experiments reveal performance differences between proprietary and open-source LLMs and task asymmetries between examiner analysis and applicant rebuttal.

Patent examination is a complex, multi-stage process requiring both technical expertise and legal reasoning, increasingly challenged by rising application volumes. Prior benchmarks predominantly view patent examination as discriminative classification or static extraction, failing to capture its inherently interactive and iterative nature, similar to the peer review and rebuttal process in academic publishing. In this paper, we introduce PatRe, the first benchmark that models the full patent examination lifecycle, including Office Action generation and applicant rebuttal. PatRe comprises 480 real-world cases and supports both oracle and retrieval-simulated evaluation settings. Our benchmark reframes patent examination as a dynamic, multi-turn process of justification and response. Extensive experiments across various LLMs reveal critical insights into model performance, including differences between proprietary and open-source models, as well as task asymmetries between examiner analysis and applicant-side rebuttal. These findings highlight both the potential and current limitations of LLMs in modeling complex, real-world legal reasoning and technical novelty judgment in patent examination. We release our code and dataset to facilitate future research on patent examination modeling.

