Automated Proving of Shannon-Type Entropy Inequalities via Fine-Tuned Language Models and Guided Tree Search
This work addresses the combinatorial search problem of proving entropy inequalities for information theorists, offering a practical automated method that outperforms existing tools.
The paper investigates using small-scale language models (0.6B-1.7B parameters) fine-tuned on atomic proof steps with guided beam search to automate proving Shannon-type entropy inequalities. Their best model achieves 85% proof success on a test set of 60 inequalities, significantly outperforming GPT-5.5 (1.7%) and Psitip (33.3%).
Proving Shannon-type entropy inequalities is a fundamental task in information theory that often requires constructing non-trivial linear combinations of known constraints, which is a combinatorial search problem that scales poorly with the number of random variables. We investigate whether small-scale large language models (0.6B--1.7B parameters), fine-tuned on atomic proof steps and combined with guided beam search, can automate this process. On a held-out test set of 60 inequalities spanning n=10 to 15 variables, our 0.6B fine-tuned model achieves an 85\% proof success rate with tree search. GPT-5.5 solves 1.7\% samples under zero-shot prompting while Psitip solves 33.3\% samples. A systematic ablation study across training context length (4096 vs.\ 8192 tokens) and data distribution (n=9-skewed vs not skewed) reveals that a 4096-token not skewed training distribution yields the best performance, with extended context and skewed data providing no marginal benefit. We further identify two dominant failure modes -- format failures and step quality degradation -- and verify that the beam-scoring heuristic is essential via a controlled ablation (random scoring reduces success from 83\% to 23\%).