AI | LG | Apr 17, 2024

On the Empirical Complexity of Reasoning and Planning in LLMs

arXiv:2404.11041v2 | 25 citations | h-index: 4 | EMNLP
Originality: Synthesis-oriented
AI Analysis

This paper offers incremental guidelines for using LLMs on reasoning tasks, aimed at practitioners in AI and machine learning.

The paper investigates why chain-of-thought and tree-of-thought techniques improve reasoning in LLMs. It finds that task decomposition breaks a problem into steps with low sample complexity, and that tree structures outperform linear ones on computationally hard tasks.

Chain-of-thought (CoT), tree-of-thought (ToT), and related techniques work surprisingly well in practice for some complex reasoning tasks with Large Language Models (LLMs), but why? This work seeks the underlying reasons by conducting experimental case studies and linking the performance benefits to well-established sample and computational complexity principles in machine learning. We experimented with 6 reasoning tasks, ranging from grade school math, air travel planning, ..., to Blocksworld. The results suggest that (i) both CoT and ToT benefit significantly from task decomposition, which breaks a complex reasoning task into a sequence of steps with low sample complexity and explicitly outlines the reasoning structure, and (ii) for computationally hard reasoning tasks, the more sophisticated tree structure of ToT outperforms the linear structure of CoT. These findings provide useful guidelines for the use of LLMs in solving reasoning tasks in practice.
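The contrast between a linear chain and a tree of steps can be illustrated with a minimal sketch on a toy subset-sum puzzle (pick numbers that add up exactly to a target). This is not the paper's experimental setup and involves no LLM: `propose_steps` is a hypothetical stand-in for a model suggesting next steps, and `solve_linear` and `solve_tree` are illustrative names. The sketch assumes positive integers and only shows why committing to one path can fail where branching with backtracking succeeds.

```python
from typing import List, Optional


def propose_steps(remaining: int, available: List[int]) -> List[int]:
    """Hypothetical stand-in for a model proposing next steps.

    Returns the numbers that still fit, most promising (largest) first.
    """
    return sorted((n for n in available if n <= remaining), reverse=True)


def solve_linear(target: int, numbers: List[int]) -> Optional[List[int]]:
    """Chain-like search: commit to the top proposal at each step, never backtrack."""
    chosen: List[int] = []
    available = list(numbers)
    remaining = target
    while remaining > 0:
        candidates = propose_steps(remaining, available)
        if not candidates:
            return None  # dead end; a single chain cannot recover
        step = candidates[0]
        chosen.append(step)
        available.remove(step)
        remaining -= step
    return chosen


def solve_tree(target: int, numbers: List[int]) -> Optional[List[int]]:
    """Tree-like search: try several proposals per step, backtrack on dead ends."""

    def search(remaining: int, available: List[int], chosen: List[int]) -> Optional[List[int]]:
        if remaining == 0:
            return chosen
        for step in propose_steps(remaining, available):
            rest = list(available)
            rest.remove(step)
            result = search(remaining - step, rest, chosen + [step])
            if result is not None:
                return result
        return None  # every branch from this state failed

    return search(target, list(numbers), [])


if __name__ == "__main__":
    print(solve_linear(11, [8, 6, 5, 1]))  # None: 8 then 1 leaves 2 unreachable
    print(solve_tree(11, [8, 6, 5, 1]))    # [6, 5]: found after backtracking from 8
```

Both solvers use the same step proposer, so the only difference is the search structure: the chain follows one path, while the tree revisits earlier choices when a branch dead-ends, at the cost of exploring more states.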

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
