CV · AI · CL · LG · Dec 5, 2024

Text Change Detection in Multilingual Documents Using Image Comparison

arXiv:2412.04137v1 · 2 citations · h-index: 4 · WACV
Originality: Incremental advance
AI Analysis

This work addresses multilingual document comparison for users in fields such as publishing and archiving. It is incremental in that it builds on existing change detection and segmentation techniques.

The paper proposes a text change detection method that compares document images directly, which avoids the limitations of OCR, and reports state-of-the-art performance on benchmark datasets.

Document comparison typically relies on optical character recognition (OCR) as its core technology. However, OCR requires selecting an appropriate language model for each document, and the performance of multilingual or hybrid models remains limited. To overcome these challenges, we propose text change detection (TCD) using an image comparison model tailored for multilingual documents. Unlike OCR-based approaches, our method employs word-level text image-to-image comparison to detect changes. Our model generates bidirectional change segmentation maps between the source and target documents. To enhance performance without requiring explicit text alignment or scaling preprocessing, we employ correlations among multi-scale attention features. We also construct a benchmark dataset comprising actual printed and scanned word pairs in various languages to evaluate our model. We validate our approach using our benchmark dataset and two public benchmarks, Distorted Document Images and the LRDE Document Binarization Dataset. We compare our model against state-of-the-art semantic segmentation and change detection models, as well as against conventional OCR-based models.
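To make the idea of bidirectional change maps concrete, the following is a minimal sketch of a naive pixel-level baseline, not the paper's learned model. It assumes two binarized word images of equal size, and the names `dilate`, `bidirectional_change_maps`, and the `tol` parameter are hypothetical. The source-to-target map marks ink that appears only in the source, and the target-to-source map marks ink that appears only in the target. A small dilation window stands in, very roughly, for the tolerance to misalignment that the abstract attributes to correlations among multi-scale attention features.

```python
import numpy as np


def dilate(mask, r):
    """Binary dilation with a (2r+1) x (2r+1) square window (hypothetical helper)."""
    h, w = mask.shape
    padded = np.pad(mask, r, mode="constant", constant_values=False)
    out = np.zeros_like(mask)
    # OR together every shifted copy of the mask that falls inside the window.
    for dy in range(2 * r + 1):
        for dx in range(2 * r + 1):
            out |= padded[dy:dy + h, dx:dx + w]
    return out


def bidirectional_change_maps(src, tgt, tol=1):
    """Return (src_to_tgt, tgt_to_src) boolean change maps.

    src, tgt: 2-D arrays of the same shape, truthy where there is ink.
    tol: assumed tolerance in pixels; ink within `tol` pixels of ink in the
         other image is treated as unchanged.
    """
    src = np.asarray(src, dtype=bool)
    tgt = np.asarray(tgt, dtype=bool)
    if src.shape != tgt.shape:
        raise ValueError("this sketch assumes images of identical shape")
    src_to_tgt = src & ~dilate(tgt, tol)  # ink only in the source
    tgt_to_src = tgt & ~dilate(src, tol)  # ink only in the target
    return src_to_tgt, tgt_to_src


# Usage: the target repeats the source stroke one pixel lower and adds a new dot.
src = np.zeros((5, 8), dtype=bool)
src[2, 1:4] = True
tgt = np.zeros((5, 8), dtype=bool)
tgt[3, 1:4] = True
tgt[0, 6] = True
removed, added = bidirectional_change_maps(src, tgt, tol=1)
print(removed.sum(), added.sum())
```

With `tol=1` the one-pixel shift is absorbed and only the new dot is flagged. With `tol=0` the shifted stroke is reported as changed in both directions, which illustrates why plain differencing needs explicit alignment.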

