CL AI HC
Jan 4, 2023

Iterated Decomposition: Improving Science Q&A by Supervising Reasoning Processes

Justin Reppert, Ben Rachbach, Charlie George, Luke Stebbing, Jungwon Byun, Maggie Appleton, Andreas Stuhlmüller

arXiv:2301.01751v2 | 24 citations | h-index: 19 | Has Code

AI Analysis

The paper addresses the problem of keeping ML systems interpretable and safe as they scale to complex tasks. The contribution is incremental: it builds on existing compositional methods and adds workflow support.

The paper aims to improve compositional reasoning in language models for science Q&A. It introduces iterated decomposition, a human-in-the-loop workflow that refines failing components, and reports accuracy improvements over less compositional baselines on three real-world tasks: 25% to 65%, 53% to 70%, and 38% to 69%.

Language models (LMs) can perform complex reasoning either end-to-end, with hidden latent state, or compositionally, with transparent intermediate state. Composition offers benefits for interpretability and safety, but may need workflow support and infrastructure to remain competitive. We describe iterated decomposition, a human-in-the-loop workflow for developing and refining compositional LM programs. We improve the performance of compositions by zooming in on failing components and refining them through decomposition, additional context, chain of thought, etc. To support this workflow, we develop ICE, an open-source tool for visualizing the execution traces of LM programs. We apply iterated decomposition to three real-world tasks and improve the accuracy of LM programs over less compositional baselines: describing the placebo used in a randomized controlled trial (25% to 65%), evaluating participant adherence to a medical intervention (53% to 70%), and answering NLP questions on the Qasper dataset (38% to 69%). These applications serve as case studies for a workflow that, if automated, could keep ML systems interpretable and safe even as they scale to increasingly complex tasks.
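The core loop of the workflow (run a composition, inspect the intermediate state, find the failing component, refine it) can be illustrated with a minimal sketch. This is not the ICE API and not the authors' implementation: every function name below is hypothetical, the two example "papers" are invented toy strings, and plain string functions stand in for LM calls so the example runs without a model.

```python
from typing import Callable

Step = tuple[str, Callable[[str], str]]

def run_with_trace(steps: list[Step], text: str) -> list[tuple[str, str, str]]:
    """Run steps in order and record (step name, input, output) for each."""
    trace = []
    for name, fn in steps:
        out = fn(text)
        trace.append((name, text, out))
        text = out
    return trace

def step_accuracy(steps: list[Step], examples: list[list[str]]) -> dict[str, float]:
    """Score each step in isolation by feeding it the gold input for that step."""
    scores = {}
    for i, (name, fn) in enumerate(steps):
        correct = sum(fn(ex[i]) == ex[i + 1] for ex in examples)
        scores[name] = correct / len(examples)
    return scores

def weakest_step(steps: list[Step], examples: list[list[str]]) -> str:
    """Name of the component with the lowest isolated accuracy."""
    scores = step_accuracy(steps, examples)
    return min(scores, key=scores.get)

# Stand-ins for LM calls (hypothetical; a real program would prompt a model).
def select_passage(paper: str) -> str:
    sentences = [s.strip() for s in paper.split(".") if s.strip()]
    return next(s for s in sentences if "placebo" in s) + "."

def describe_placebo_v1(passage: str) -> str:
    # Naive baseline: only handles the phrasing "placebo was a ...".
    return passage.split("placebo was a ")[-1].rstrip(".")

def describe_placebo_v2(passage: str) -> str:
    # Refined component: handles a second phrasing found by inspecting failures.
    for marker in ("placebo was a ", "placebo of "):
        if marker in passage:
            return passage.split(marker, 1)[1].rstrip(".")
    return ""

# Toy gold data: [paper text, gold passage, gold answer].
EXAMPLES = [
    ["We enrolled 40 adults. The placebo was a sugar pill. Outcomes were measured weekly.",
     "The placebo was a sugar pill.",
     "sugar pill"],
    ["Participants received the drug or a placebo of saline injection. Follow-up lasted a year.",
     "Participants received the drug or a placebo of saline injection.",
     "saline injection"],
]

steps_v1 = [("select_passage", select_passage), ("describe_placebo", describe_placebo_v1)]
steps_v2 = [("select_passage", select_passage), ("describe_placebo", describe_placebo_v2)]

if __name__ == "__main__":
    print(step_accuracy(steps_v1, EXAMPLES))  # per-step scores before refinement
    print(weakest_step(steps_v1, EXAMPLES))   # component to zoom in on
    print(step_accuracy(steps_v2, EXAMPLES))  # per-step scores after refinement
```

Scoring each step on its own gold input, rather than on the output of the step before it, keeps upstream errors from being blamed on downstream components. That separation is what makes it possible to zoom in on one failing piece at a time.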
