CL AI HC IR LG | Jan 16, 2025

OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking

Zekun Xi, Wenbiao Yin, Jizhan Fang, Jialong Wu, Runnan Fang, Yong Jiang, Pengjun Xie, Fei Huang, Huajun Chen, Ningyu Zhang

arXiv:2501.09751v522.224 citations | h-index: 32 | Has Code | EMNLP

Originality: Incremental advance

AI Analysis

This work addresses the challenge of generating high-quality, information-rich long-form articles for automated content creation, though its contribution appears incremental because it builds on existing retrieval-augmented generation methods.

The paper tackles the problem of shallow and repetitive content in machine writing by proposing OmniThink, a slow-thinking framework that improves knowledge density in generated articles without sacrificing coherence and depth.

Machine writing with large language models often relies on retrieval-augmented generation. However, these approaches remain confined within the boundaries of the model's predefined scope, limiting the generation of content with rich information. Specifically, vanilla-retrieved information tends to lack depth and novelty and to suffer from redundancy, which degrades the quality of generated articles and leads to shallow, unoriginal, and repetitive outputs. To address these issues, we propose OmniThink, a slow-thinking machine writing framework that emulates the human-like process of iterative expansion and reflection. The core idea behind OmniThink is to simulate the cognitive behavior of learners as they slowly deepen their knowledge of a topic. Experimental results demonstrate that OmniThink improves the knowledge density of generated articles without compromising metrics such as coherence and depth. Human evaluations and expert feedback further highlight the potential of OmniThink to address real-world challenges in the generation of long-form articles. Code is available at https://github.com/zjunlp/OmniThink.
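The "iterative expansion and reflection" idea in the abstract can be pictured as a loop that alternates between gathering material for a topic and filtering out what is already known. The sketch below is illustrative only and is not the OmniThink implementation (see the linked repository for that). The function names `expand_and_reflect`, `toy_retrieve`, and `toy_subtopics`, the exact-match redundancy check, and the toy corpus are all assumptions made for this example; a real system would presumably call a search engine and a language model instead.

```python
from typing import Callable, Dict, List, Set


def expand_and_reflect(
    topic: str,
    retrieve: Callable[[str], List[str]],
    propose_subtopics: Callable[[str, List[str]], List[str]],
    max_depth: int = 2,
) -> Dict[str, List[str]]:
    """Grow a topic tree level by level, keeping only snippets not seen before.

    Hypothetical helper: `retrieve` and `propose_subtopics` stand in for a
    search backend and an LLM call respectively.
    """
    seen: Set[str] = set()            # every snippet accepted so far
    tree: Dict[str, List[str]] = {}   # node -> novel snippets attached to it
    frontier = [topic]
    for _ in range(max_depth + 1):
        next_frontier: List[str] = []
        for node in frontier:
            if node in tree:          # never expand the same node twice
                continue
            # Expansion: gather raw material for this node.
            snippets = retrieve(node)
            # Reflection: discard anything redundant with what is already known.
            novel: List[str] = []
            for snippet in snippets:
                if snippet not in seen:
                    seen.add(snippet)
                    novel.append(snippet)
            tree[node] = novel
            # Only nodes that contributed new information are expanded further.
            if novel:
                next_frontier.extend(propose_subtopics(node, novel))
        frontier = next_frontier
    return tree


# Toy stand-ins for retrieval and subtopic proposal (assumed data).
TOY_CORPUS = {
    "solar power": ["PV cells convert light to electricity",
                    "Panels degrade over time"],
    "PV cells": ["PV cells convert light to electricity",
                 "Silicon dominates PV cells"],
    "degradation": ["Panels degrade over time"],
}
TOY_SUBTOPICS = {"solar power": ["PV cells", "degradation"]}


def toy_retrieve(query: str) -> List[str]:
    return TOY_CORPUS.get(query, [])


def toy_subtopics(node: str, novel: List[str]) -> List[str]:
    return TOY_SUBTOPICS.get(node, [])


tree = expand_and_reflect("solar power", toy_retrieve, toy_subtopics)
```

In this toy run the "PV cells" node keeps only the snippet that the root had not already supplied, and the "degradation" node ends up empty because everything it retrieved was redundant. That is the kind of filtering the abstract credits with reducing repetitive output, though the actual framework's notion of redundancy is likely far richer than exact string matching.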

