Codifying Natural Language Tasks
This work addresses the challenge of automating complex domain-specific tasks in fields such as law and medicine, proposing a novel method for a known bottleneck.
The paper tackles real-world natural language tasks such as legal judgment and medical QA by transforming natural language into executable programs, achieving up to 161.1% relative improvement across 13 benchmarks.
We explore the applicability of text-to-code to real-world problems that are typically solved in natural language, such as legal judgment and medical QA. Unlike previous work, our approach leverages the explicit reasoning provided by program generation. We present ICRAG, a framework that transforms natural language into executable programs through iterative refinement using external knowledge from domain resources and GitHub. Across 13 benchmarks, ICRAG achieves up to 161.1% relative improvement. We provide a detailed analysis of the generated code and the impact of external knowledge, and we discuss the limitations of applying text-to-code approaches to real-world natural language tasks.
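To make the idea of iterative refinement concrete, the following is a minimal sketch of a generate, execute, and refine loop. It is not ICRAG's actual implementation: the names `run_program`, `refine_loop`, `toy_retrieve`, and `toy_generate` are hypothetical, the language model and the retrieval over domain resources and GitHub are replaced by hard-coded stubs, and the toy task (deciding whether a 17-year-old is a minor) is an assumption chosen only for illustration.

```python
from typing import Callable, List, Optional, Tuple


def run_program(source: str, entry: str = "solve") -> Tuple[Optional[object], Optional[str]]:
    """Execute candidate source and call its entry function.

    Returns (result, None) on success or (None, error_message) on failure.
    """
    namespace: dict = {}
    try:
        exec(source, namespace)  # define the candidate program's functions
        return namespace[entry](), None
    except Exception as exc:
        return None, f"{type(exc).__name__}: {exc}"


def refine_loop(
    task: str,
    generate: Callable[[str, List[str], Optional[str]], str],
    retrieve: Callable[[str, Optional[str]], List[str]],
    max_rounds: int = 3,
) -> Tuple[Optional[object], str]:
    """Regenerate a program until it runs, feeding back the last error."""
    feedback: Optional[str] = None
    source = ""
    for _ in range(max_rounds):
        context = retrieve(task, feedback)           # external knowledge (stubbed)
        source = generate(task, context, feedback)   # text-to-code step (stubbed)
        result, error = run_program(source)
        if error is None:
            return result, source
        feedback = error                             # drives the next round
    return None, source


# Hypothetical stubs standing in for a retriever and a language model.
def toy_retrieve(task: str, feedback: Optional[str]) -> List[str]:
    # Only supplies the missing definition once an error mentions it.
    if feedback and "ADULT_AGE" in feedback:
        return ["ADULT_AGE = 18"]
    return []


def toy_generate(task: str, context: List[str], feedback: Optional[str]) -> str:
    # Prepends retrieved snippets to a fixed program body.
    header = "\n".join(context)
    return header + "\ndef solve():\n    age = 17\n    return age < ADULT_AGE\n"


if __name__ == "__main__":
    answer, program = refine_loop("Is a 17-year-old a minor?", toy_generate, toy_retrieve)
    print(answer)
    print(program)
```

In this sketch the first candidate fails with a `NameError` because `ADULT_AGE` is undefined; the error message is passed back as feedback, the stub retriever supplies the missing definition, and the second candidate executes successfully. A real system would replace both stubs with model and retrieval calls and would likely sandbox the `exec` step.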