CL · LG · Oct 18, 2024

Beyond Autoregression: Discrete Diffusion for Complex Reasoning and Planning

arXiv:2410.14157v3 · 109 citations · h-index: 19 · Has Code · ICLR
Originality: Highly original
AI Analysis

This work addresses limitations of AI in sophisticated language understanding and problem-solving, and represents a novel method rather than an incremental improvement.

The paper tackles complex reasoning and long-term planning tasks, where autoregressive language models struggle. It introduces discrete diffusion models trained with Multi-Granularity Diffusion Modeling (MGDM), which prioritizes difficult subgoals, and reports 91.5% accuracy on Countdown and 100% on Sudoku, compared with 45.8% and 20.7% for autoregressive models.

Autoregressive language models, despite their impressive capabilities, struggle with complex reasoning and long-term planning tasks. We introduce discrete diffusion models as a novel solution to these challenges. Through the lens of subgoal imbalance, we demonstrate how diffusion models effectively learn difficult subgoals that elude autoregressive approaches. We propose Multi-Granularity Diffusion Modeling (MGDM), which prioritizes subgoals based on difficulty during learning. On complex tasks like Countdown, Sudoku, and Boolean Satisfiability Problems, MGDM significantly outperforms autoregressive models without using search techniques. For instance, MGDM achieves 91.5% and 100% accuracy on Countdown and Sudoku, respectively, compared to 45.8% and 20.7% for autoregressive models. Our work highlights the potential of diffusion-based approaches in advancing AI capabilities for sophisticated language understanding and problem-solving tasks. All associated codes are available at https://github.com/HKUNLP/diffusion-vs-ar.
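The abstract says MGDM prioritizes subgoals by difficulty during learning but does not give a formula here. The following is a minimal sketch of one way such prioritization could look, assuming difficulty is measured by the per-token loss and that each token's weight takes a focal-style form `alpha * (1 - exp(-loss)) ** beta`. The function names, the `alpha` and `beta` parameters, and the example loss values are all hypothetical illustrations and are not taken from the HKUNLP/diffusion-vs-ar repository.

```python
import math


def difficulty_weights(token_losses, alpha=1.0, beta=1.0):
    """Return one weight per token; tokens with higher loss get larger weights.

    Assumption: the weight form alpha * (1 - exp(-loss)) ** beta is an
    illustrative choice, not confirmed by the text above.
    """
    # 1 - exp(-loss) maps a loss in [0, inf) into [0, 1): close to 0 for
    # easy tokens (small loss) and close to 1 for hard tokens (large loss).
    return [alpha * (1.0 - math.exp(-loss)) ** beta for loss in token_losses]


def reweighted_loss(token_losses, alpha=1.0, beta=1.0):
    """Mean of the per-token losses, each scaled by its difficulty weight."""
    weights = difficulty_weights(token_losses, alpha, beta)
    weighted = [w * loss for w, loss in zip(weights, token_losses)]
    return sum(weighted) / len(token_losses)


# Hypothetical per-token losses: three easy tokens and one hard token.
losses = [0.05, 0.02, 0.03, 2.5]
weights = difficulty_weights(losses)
total = reweighted_loss(losses)
```

With these example values the hard token receives by far the largest weight, so it dominates the reweighted objective, which is the intuition behind prioritizing difficult subgoals.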

Code Implementations: 1 repo