CL | Jan 4, 2025

Survey on Question Answering over Visually Rich Documents: Methods, Challenges, and Trends

arXiv:2501.02235v2 | 5 citations | h-index: 11
Originality: Synthesis-oriented
AI Analysis

This survey addresses the lack of consensus on key aspects of the visually rich document understanding pipeline, offering an overview for researchers and practitioners in AI and document processing.

The paper provides a comprehensive survey of state-of-the-art methods for question answering over visually rich documents, highlighting their strengths, limitations, and key challenges in the field.

The field of visually-rich document understanding, which involves interacting with visually-rich documents (whether scanned or born-digital), is rapidly evolving and still lacks consensus on several key aspects of the processing pipeline. In this work, we provide a comprehensive overview of state-of-the-art approaches, emphasizing their strengths and limitations, pointing out the main challenges in the field, and proposing promising research directions.
