LogicPro: Improving Complex Logical Reasoning via Program-Guided Learning
This work addresses the need for scalable and effective reasoning data in AI, though the contribution is incremental, as it builds on existing data synthesis methods.
The paper tackles the problem of generating high-quality data for complex logical reasoning. It synthesizes text reasoning processes from LeetCode-style algorithm problems and their program solutions, producing a 540K dataset that significantly improves multiple models on benchmarks such as BBH, LogicBench, DROP, AR-LSAT, and GSM8K.
In this paper, we propose a new data synthesis method called \textbf{LogicPro}, which leverages LeetCode-style algorithm \underline{Pro}blems and their corresponding \underline{Pro}gram solutions to synthesize complex \underline{Logic}al reasoning data in text format. First, we synthesize complex reasoning problems from source algorithm problems and test cases. Then, we obtain standard answers and intermediate variable outputs for each problem by running the standard Python solutions on the test cases. Finally, guided by the code's intermediate variables, we synthesize the text reasoning process for each reasoning problem. This method yields data that is difficult, scalable, and effective, and that comes with gold-standard answers and high-quality reasoning processes. As a result, with our 540K synthesized dataset constructed solely from 2,360 algorithm problems, our approach\footnote{Code and data are publicly available at https://github.com/jiangjin1999/LogicPro} achieves significant improvements across multiple models on datasets including \textit{BBH$^{27}$}, \textit{LogicBench}, \textit{DROP}, \textit{AR-LSAT}, and \textit{GSM8K}, outperforming a wide range of existing reasoning datasets.
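
The second step, obtaining a standard answer together with intermediate variable outputs from a Python solution and a test case, can be illustrated with a minimal sketch. This is not the authors' implementation: the helper `run_with_trace`, the example solution `climb_stairs`, and the choice of `sys.settrace` to capture local variables are all assumptions made here for illustration only.

```python
import copy
import sys
from typing import Any, Callable


def run_with_trace(func: Callable[..., Any], *args: Any) -> tuple[Any, list[dict[str, Any]]]:
    """Hypothetical helper: run func(*args) and record its local variables.

    A snapshot is stored each time the locals of `func` differ from the
    previously stored snapshot, so the list reads as a step-by-step trace.
    """
    snapshots: list[dict[str, Any]] = []
    target_code = func.__code__

    def tracer(frame, event, arg):
        # Only trace the frame of the solution itself, not functions it calls.
        if frame.f_code is not target_code:
            return None
        if event in ("line", "return"):
            # Deep copy so later mutations do not alter earlier snapshots.
            current = copy.deepcopy(dict(frame.f_locals))
            if not snapshots or snapshots[-1] != current:
                snapshots.append(current)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        # Always restore whatever tracer was active before.
        sys.settrace(previous)
    return result, snapshots


def climb_stairs(n: int) -> int:
    """Illustrative LeetCode-style solution: count ways to climb n stairs."""
    prev, curr = 1, 1
    for _ in range(n - 1):
        prev, curr = curr, prev + curr
    return curr


# One test case: the answer plus the intermediate states that could guide
# the synthesis of a text reasoning process.
answer, trace = run_with_trace(climb_stairs, 4)
for step, state in enumerate(trace):
    print(step, state)
print("answer:", answer)
```

In a pipeline like the one described, the recorded states would presumably be passed alongside the problem text when generating the reasoning process, while the returned answer serves as the reference label.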