CL CV Aug 8, 2024

Arctic-TILT. Business Document Understanding at Sub-Billion Scale

arXiv:2408.04632v1 4 citations h-index: 11
Originality: Incremental advance
AI Analysis

This paper addresses the need for efficient, cost-effective document understanding in enterprise environments, though its contribution is incremental: it scales down existing methods.

The paper tackles question answering over PDF and scanned-document content with large language models. Its model reaches accuracy comparable to models 1000 times larger, can be fine-tuned and deployed on a single 24GB GPU, and processes documents of up to 400k tokens.

A large portion of workloads employing LLMs involves answering questions grounded in PDF or scan content. We introduce Arctic-TILT, which achieves accuracy on par with models 1000$\times$ its size on these use cases. It can be fine-tuned and deployed on a single 24GB GPU, lowering operational costs while processing Visually Rich Documents with up to 400k tokens. The model establishes state-of-the-art results on seven diverse Document Understanding benchmarks and provides reliable confidence scores and quick inference, which are essential for processing files in large-scale or time-sensitive enterprise environments.

