CL AI LG | Oct 5, 2023

Agent Instructs Large Language Models to be General Zero-Shot Reasoners

Nicholas Crispino, Kyle Montgomery, Fankun Zeng, Dawn Song, Chenguang Wang

arXiv:2310.03710v29.843 citations | h-index: 4 | Has Code

Originality: Incremental advance

AI Analysis

This paper addresses the challenge of improving zero-shot reasoning in large language models, a problem relevant to AI researchers and practitioners, and represents a strong incremental advance in method design.

The paper aims to improve the zero-shot reasoning of large language models on general language understanding tasks by introducing an autonomous agent that instructs the reasoning process. The method reaches state-of-the-art zero-shot performance on 20 of the 29 datasets evaluated. Reported gains include 13.3% for Vicuna-13b and an average increase of 10.5% in reasoning over zero-shot chain of thought.

We introduce a method to improve the zero-shot reasoning abilities of large language models on general language understanding tasks. Specifically, we build an autonomous agent to instruct the reasoning process of large language models. We show this approach further unleashes the zero-shot reasoning abilities of large language models to more tasks. We study the performance of our method on a wide set of datasets spanning generation, classification, and reasoning. We show that our method generalizes to most tasks and obtains state-of-the-art zero-shot performance on 20 of the 29 datasets that we evaluate. For instance, our method boosts the performance of state-of-the-art large language models by a large margin, including Vicuna-13b (13.3%), Llama-2-70b-chat (23.2%), and GPT-3.5 Turbo (17.0%). Compared to zero-shot chain of thought, our improvement in reasoning is striking, with an average increase of 10.5%. With our method, Llama-2-70b-chat outperforms zero-shot GPT-3.5 Turbo by 10.2%.
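The abstract describes an agent that instructs the reasoning process of a large language model but gives no implementation detail. The sketch below is one minimal reading of that idea, not the authors' code: an agent model writes task-level instructions, and a second model answers each input with those instructions prepended. The names `build_task_instructions`, `answer_with_instructions`, the `LLM` type alias, and the prompt wording are all hypothetical, and the two stub models stand in for real model calls.

```python
from typing import Callable

# Hypothetical type: any function that maps a prompt string to a completion string.
LLM = Callable[[str], str]


def build_task_instructions(agent: LLM, task_description: str) -> str:
    """Ask the agent model for step-by-step instructions for a task.

    Assumption: instructions are produced from a task description alone;
    the abstract does not say what the agent actually sees.
    """
    prompt = (
        f"Task: {task_description}\n"
        "Write step-by-step instructions for solving this task."
    )
    return agent(prompt)


def answer_with_instructions(reasoner: LLM, instructions: str, task_input: str) -> str:
    """Prompt the reasoning model with the agent's instructions plus one input."""
    prompt = f"Instructions:\n{instructions}\n\nInput: {task_input}\nAnswer:"
    return reasoner(prompt)


# Stub models so the sketch runs without any external service (assumptions).
def stub_agent(prompt: str) -> str:
    return "1. Read the input. 2. Decide whether the tone is positive or negative."


def stub_reasoner(prompt: str) -> str:
    # Toy rule in place of a real model call.
    return "positive" if "great" in prompt else "negative"


if __name__ == "__main__":
    instructions = build_task_instructions(stub_agent, "sentiment classification")
    print(answer_with_instructions(stub_reasoner, instructions, "A great movie."))
```

In this sketch the agent is called once per task rather than once per input, so every input for the same task reuses the same instructions. That is a design choice of the example, not a detail the abstract confirms.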

