Survey on Question Answering over Visually Rich Documents: Methods, Challenges, and Trends
This survey addresses the lack of consensus in visually rich document understanding, offering an overview for researchers and practitioners in AI and document processing.
The paper provides a comprehensive survey of state-of-the-art methods for question answering over visually rich documents, highlighting their strengths, limitations, and key challenges in the field.
The field of visually rich document understanding, which involves interacting with visually rich documents (whether scanned or born-digital), is rapidly evolving and still lacks consensus on several key aspects of the processing pipeline. In this work, we provide a comprehensive overview of state-of-the-art approaches, emphasizing their strengths and limitations, pointing out the main challenges in the field, and proposing promising research directions.