Research

Reproducible benchmarks and writeups on AI research tooling over the arXiv corpus.

Benchmark

Corpus MCP vs deep-research

A blind, reproducible head-to-head of a curated-corpus MCP against a general web deep-research pipeline — including where the built-in beat us.