CVGR | Oct 15, 2025

Automated document processing system for government agencies using DBNET++ and BART models

arXiv:2510.13303v1 | Int j circuit comput netw
Originality: Synthesis-oriented
AI Analysis

This work addresses document processing for government agencies, but it is incremental: it combines existing methods for a specific application.

The paper tackles automated document classification from images by detecting text with DBNet++ and classifying it with BART, achieving a text detection accuracy of 92.88% on the Total-Text dataset.

An automatic document classification system is presented that detects textual content in images and classifies documents into four predefined categories (Invoice, Report, Letter, and Form). The system supports both offline images (e.g., files on flash drives, HDDs, microSD cards) and real-time capture via connected cameras, and is designed to mitigate practical challenges such as variable illumination, arbitrary orientation, curved or partially occluded text, low resolution, and distant text. The pipeline comprises four stages: image capture, preprocessing, text detection [1] using a DBNet++ (Differentiable Binarization Network Plus) detector, and text classification [2] using a BART (Bidirectional and Auto-Regressive Transformers) classifier, all integrated within a user interface implemented in Python with PyQt5. For text detection in images, the system achieved about 92.88% accuracy over a 10-hour run on the Total-Text dataset, which consists of high-resolution images that simulate a variety of very difficult challenges. The results indicate that the proposed approach is effective for practical, mixed-source document categorization in unconstrained imaging scenarios.
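The staged design described in the abstract can be sketched as a small Python skeleton. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `classify_document`, `fake_detector`, and `keyword_classifier` are hypothetical, the detector and classifier are trivial stand-ins where the paper uses DBNet++ and BART, and the preprocessing step is a placeholder normalization. Only the four category names come from the source.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# The four categories named in the abstract.
CATEGORIES = ("Invoice", "Report", "Letter", "Form")


@dataclass
class DocumentResult:
    text: str      # text recovered from the image
    category: str  # one of CATEGORIES


def preprocess(image: Sequence[Sequence[int]]) -> List[List[float]]:
    """Placeholder preprocessing: scale 8-bit grayscale pixels to [0, 1]."""
    return [[px / 255.0 for px in row] for row in image]


def classify_document(
    image: Sequence[Sequence[int]],
    detect_text: Callable[[List[List[float]]], List[str]],
    classify_text: Callable[[str, Sequence[str]], str],
) -> DocumentResult:
    """Run preprocessing, text detection, and text classification in order.

    detect_text and classify_text are injected, so a real detector
    (e.g. DBNet++) and a real classifier (e.g. BART) could replace
    the stand-ins used below.
    """
    normalized = preprocess(image)
    text = " ".join(detect_text(normalized))
    category = classify_text(text, CATEGORIES)
    if category not in CATEGORIES:
        raise ValueError(f"unexpected category: {category!r}")
    return DocumentResult(text=text, category=category)


# Hypothetical stand-ins, for illustration only.
def fake_detector(image: List[List[float]]) -> List[str]:
    """Pretend these words were detected in the image."""
    return ["Invoice", "No.", "1042", "Total", "due"]


def keyword_classifier(text: str, labels: Sequence[str]) -> str:
    """Return the first label whose name appears in the text."""
    lowered = text.lower()
    for label in labels:
        if label.lower() in lowered:
            return label
    return labels[0]  # arbitrary fallback for the sketch


result = classify_document([[0, 255], [128, 64]], fake_detector, keyword_classifier)
print(result.category)
```

Passing the detector and classifier as callables keeps the stages independent, so the same flow could serve both offline images and camera frames once real models are plugged in.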
