Format-Adapter: Improving Reasoning Capability of LLMs by Adapting Suitable Format
Format-Adapter addresses the high labeling cost of human-labeled reasoning formats, and the risk that such formats do not suit a given task, when improving reasoning in LLMs. It is an incremental advance over prior multi-format methods.
The paper tackles the problem of reasoning inconsistencies in large language models by proposing Format-Adapter, a method that automatically generates and selects suitable reasoning formats for tasks instead of relying on human-labeled formats. It achieves an average 4.3% performance improvement on math and commonsense reasoning tasks compared to previous works.
Generating multiple answers and voting among them is an effective way to mitigate the reasoning inconsistencies of large language models (LLMs). Prior works have shown that using multiple reasoning formats outperforms using a single format when generating multiple answers. However, these multi-format methods rely on formats labeled by humans, which may not suit every task and are costly to label. To address this issue, we adapt formats to the given task by generating and selecting them. We first propose a way to measure the reasoning error incurred when generating multiple answers. Then, we introduce Format-Adapter, which uses LLMs to generate and select suitable reasoning formats by minimizing this error measurement. We conduct experiments on math and commonsense reasoning tasks, where Format-Adapter achieves a 4.3% performance improvement on average over previous works, demonstrating its effectiveness.
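The generate, select, and vote pipeline described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not define the error measurement, so `error_fn` here is a hypothetical caller-supplied scoring function, and `solve` is a hypothetical stand-in for an LLM call that answers a question in a given reasoning format. All function names are assumptions made for illustration.

```python
from collections import Counter
from typing import Callable, List, Sequence


def majority_vote(answers: Sequence[str]) -> str:
    """Return the most frequent answer; ties go to the earliest-seen answer."""
    if not answers:
        raise ValueError("need at least one answer")
    return Counter(answers).most_common(1)[0][0]


def select_formats(
    candidates: Sequence[str],
    error_fn: Callable[[str], float],  # hypothetical: lower means more suitable
    k: int,
) -> List[str]:
    """Keep the k candidate formats with the lowest score under error_fn."""
    return sorted(candidates, key=error_fn)[:k]


def answer_with_formats(
    question: str,
    formats: Sequence[str],
    solve: Callable[[str, str], str],  # hypothetical LLM call: (question, format) -> answer
) -> str:
    """Generate one answer per format, then vote across the answers."""
    return majority_vote([solve(question, fmt) for fmt in formats])
```

In this sketch, format selection and answer voting are kept separate so that a different error measurement can be swapped in without touching the voting step.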