paper.json: A Coordination Convention for LLM-Agent-Actionable Papers

arXiv:2605.16194 · 11.3 · Has Code

Predicted impact top 12% in DL · last 90 days · Originality: Synthesis-oriented

AI Analysis

For researchers and LLM agent developers, this addresses the problem of unreliable agentic reading of academic papers, but the contribution is incremental: a practical convention rather than a new paradigm.

The paper proposes a lightweight JSON convention (paper.json) to accompany academic PDFs, enabling LLM agents to reliably extract sub-claims, scope limitations, and executable figure commands. The convention includes stable claim IDs, explicit does-not-claim lists, per-figure shell commands, and stable definition IDs, with a claim that hand-written compliance is achievable in under an hour.

LLM agents routinely serve as first (and sometimes only) readers of academic papers, skimming for sub-claims, extracting reproducibility steps, and generalizing scope. Standard prose papers produce recurring failures in this role: sub-claims that cannot be cited at sub-paper granularity, scope overextension beyond what the paper tests, and figure commands buried in codebases rather than the paper itself. We propose `paper.json`, a companion JSON file that travels with the PDF and addresses each failure with a lightweight convention: stable claim IDs (C1), an explicit does-not-claim list (C2), exact per-figure shell commands (C3), and stable definition IDs (C5). A fifth convention (C4) holds that minimum viable compliance, hand-written JSON alongside the PDF, is achievable in under an hour for a finished paper without touching the human-readable output. C1, C2, C3, and C5 are open invitations: an agent that reads a compliant paper and acts on it produces evidence for or against them. This paper is itself compliant: `uv run validator.py paper.json --against paper.typ` passes. Repo: https://github.com/arquicanedo/paper-json
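To make the four structural conventions concrete, here is a minimal Python sketch of what a companion file and a basic consistency check might look like. Every key name (`claims`, `does_not_claim`, `figures`, `definitions`), the `F`/`D` ID prefixes, the `make_figure.py` command, and the `check` function are assumptions made for illustration; the actual schema and the behaviour of `validator.py` are defined in the paper's repository and may differ.

```python
import json
import re

# Hypothetical layout: all key names and values below are illustrative
# placeholders, not the schema defined by the paper or its repository.
EXAMPLE = {
    "claims": [
        {"id": "C1", "text": "Stable claim IDs allow citation at sub-paper granularity."},
        {"id": "C3", "text": "Exact per-figure shell commands live with the paper."},
    ],
    "does_not_claim": ["<placeholder: a statement the paper explicitly does not make>"],
    "figures": [{"id": "F1", "command": "uv run make_figure.py --figure 1"}],
    "definitions": [{"id": "D1", "term": "compliant paper", "text": "<placeholder>"}],
}

# Assumed ID shapes: a letter prefix followed by digits.
ID_PATTERNS = {"claims": r"C\d+", "figures": r"F\d+", "definitions": r"D\d+"}


def check(doc):
    """Return a list of human-readable problems; an empty list means the checks passed."""
    problems = []
    for section, pattern in ID_PATTERNS.items():
        seen = set()
        for entry in doc.get(section, []):
            entry_id = entry.get("id", "")
            if not re.fullmatch(pattern, entry_id):
                problems.append(f"{section}: bad id {entry_id!r}")
            elif entry_id in seen:
                problems.append(f"{section}: duplicate id {entry_id}")
            seen.add(entry_id)
    # The does-not-claim list must exist and be non-empty.
    if not doc.get("does_not_claim"):
        problems.append("does_not_claim: list is missing or empty")
    # Every figure needs a non-blank shell command.
    for fig in doc.get("figures", []):
        if not fig.get("command", "").strip():
            problems.append(f"figures: {fig.get('id')} has no command")
    return problems


if __name__ == "__main__":
    # Round-trip through JSON text, as an agent reading the file from disk would.
    loaded = json.loads(json.dumps(EXAMPLE))
    print(check(loaded))
```

A check of this kind only covers the file's internal consistency. The command quoted in the abstract also validates `paper.json` against `paper.typ`, which this sketch does not attempt.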

