CL · May 28

COMPOSE: Composing Future Theorems from Citations and Formal Structure

arXiv:2605.30333 · 75.6
Predicted impact: top 81% in CL · last 90 days
Originality: Highly original
AI Analysis

This work introduces a new task (grounded future mathematical generation) and a dual-graph framework that combines scientific and formal structure, benefiting researchers in automated theorem discovery and mathematical AI.

COMPOSE generates plausible future theorem-like claims by conditioning a language model on both citation context and formal theorem dependencies, outperforming baselines on retrieval and LLM-judge evaluation with a dataset of 108K examples.

A plausible future mathematical claim must satisfy two constraints: it should follow the direction of prior work and respect the formal dependencies that constrain what can validly follow. Existing approaches typically model only one of these sources, producing claims that are either weakly grounded or insufficiently motivated. We introduce grounded future mathematical generation, where the goal is to generate a plausible future theorem-like claim for an anchor paper using two complementary sources of context: its scientific citation graph and its aligned formal theorem dependency graph. To address this setting, we propose COMPOSE, a dual-graph framework that conditions a language model on both scientific citation context and formal theorem structure. To support it, we construct a dataset of 108K paired scientific-formal graph examples from arXiv and Mathlib, together with a benchmark of 47K future papers from 2024–2025. Experiments show that COMPOSE outperforms strong baselines on retrieval to real future papers and achieves the best overall performance under LLM-judge evaluation, producing more grounded and mathematically richer outputs. These results show that future mathematical generation benefits from combining scientific context with formal structure. The project page is available at https://david-busbib.github.io/COMPOSE-page/.
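The abstract does not say how COMPOSE encodes its two graphs, so the following is only a minimal sketch of the general idea of dual-graph conditioning, not the authors' method. It assumes a plain text prompt built from the anchor paper's citation neighbourhood and from the dependency neighbourhood of a few aligned theorems. The names `DualGraphExample`, `neighbors`, and `build_prompt`, the prompt layout, and all paper and theorem names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DualGraphExample:
    """One anchor paper with its two context graphs (hypothetical schema)."""
    anchor_title: str
    # Citation graph: paper title -> titles of the papers it cites.
    citations: Dict[str, List[str]] = field(default_factory=dict)
    # Formal graph: theorem name -> names of the theorems it depends on.
    dependencies: Dict[str, List[str]] = field(default_factory=dict)


def neighbors(graph: Dict[str, List[str]], start: str, max_hops: int) -> List[str]:
    """Breadth-first list of nodes within max_hops of start, excluding start."""
    seen = [start]
    frontier = [start]
    for _ in range(max_hops):
        next_frontier = []
        for node in frontier:
            for nb in graph.get(node, []):
                if nb not in seen:
                    seen.append(nb)
                    next_frontier.append(nb)
        frontier = next_frontier
    return seen[1:]


def build_prompt(example: DualGraphExample, formal_roots: List[str],
                 max_hops: int = 1) -> str:
    """Serialize both neighbourhoods into one prompt (assumed layout)."""
    cited = neighbors(example.citations, example.anchor_title, max_hops)
    formal: List[str] = []
    for root in formal_roots:
        for name in [root] + neighbors(example.dependencies, root, max_hops):
            if name not in formal:
                formal.append(name)
    lines = [f"Anchor paper: {example.anchor_title}", "Cited work:"]
    lines += [f"- {title}" for title in cited]
    lines.append("Formal dependencies:")
    lines += [f"- {name}" for name in formal]
    lines.append("Propose one plausible future theorem-like claim.")
    return "\n".join(lines)


# Placeholder data only; these are not real papers or Mathlib theorems.
example = DualGraphExample(
    anchor_title="Anchor paper",
    citations={"Anchor paper": ["Paper B", "Paper C"], "Paper B": ["Paper D"]},
    dependencies={"theorem_main": ["lemma_aux"]},
)
prompt = build_prompt(example, formal_roots=["theorem_main"])
```

With `max_hops=1` the prompt lists only the anchor's direct citations and the direct dependencies of each root theorem. Raising `max_hops` widens both neighbourhoods at the cost of a longer context.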

