CL, AI, Oct 15, 2025

Document Intelligence in the Era of Large Language Models: A Survey

arXiv:2510.13366v1, 5 citations, h-index: 1
Originality: Synthesis-oriented
AI Analysis

The paper synthesizes the state of the art in DAI for researchers and practitioners, but as a survey its contribution is incremental.

This survey examines how large language models (LLMs) have transformed Document AI (DAI), highlighting advancements in understanding and generation, and provides a structured analysis of current research and future directions.

Document AI (DAI) has emerged as a vital application area and has been significantly transformed by the advent of large language models (LLMs). While earlier approaches relied on encoder-decoder architectures, decoder-only LLMs have revolutionized DAI, bringing remarkable advancements in understanding and generation. This survey provides a comprehensive overview of DAI's evolution, highlighting current research efforts and the future prospects of LLMs in this field. We explore key advancements and challenges in multimodal, multilingual, and retrieval-augmented DAI, and suggest future research directions, including agent-based approaches and document-specific foundation models. This paper aims to provide a structured analysis of the state of the art in DAI and its implications for both academic and practical applications.

