LG CV ML | Oct 5, 2020

OLALA: Object-Level Active Learning for Efficient Document Layout Annotation

Zejiang Shen, Jian Zhao, Melissa Dell, Yaoliang Yu, Weining Li

arXiv:2010.01762v339.5294 citations | Has Code

Originality: Incremental advance

AI Analysis

This work addresses the inefficiency in annotating intricate document layouts for researchers and practitioners in document analysis, though it is incremental as it builds on existing active learning and semi-supervised methods.

The paper tackles the problem of expensive manual annotation for document layout datasets by proposing OLALA, an object-level active learning framework that selects ambiguous regions for labeling and uses semi-automatic correction, resulting in significantly boosted model performance and improved annotation efficiency with the same labeling budget.

Document images often have intricate layout structures, with numerous content regions (e.g. texts, figures, tables) densely arranged on each page. This makes the manual annotation of layout datasets expensive and inefficient. These characteristics also challenge existing active learning methods, as image-level scoring and selection suffer from the overexposure of common objects. Inspired by recent progress in semi-supervised learning and self-training, we propose an Object-Level Active Learning framework for efficient document layout Annotation, OLALA. In this framework, only regions with the most ambiguous object predictions within an image are selected for annotators to label, optimizing the use of the annotation budget. For unselected predictions, a semi-automatic correction algorithm is proposed to identify certain errors based on prior knowledge of layout structures and rectify them with minor supervision. Additionally, we carefully design a perturbation-based object scoring function for document images. It governs the object selection process by evaluating prediction ambiguities, and considers both the positions and categories of predicted layout objects. Extensive experiments show that OLALA can significantly boost model performance and improve annotation efficiency, given the same labeling budget. Code for this paper can be accessed via https://github.com/lolipopshock/detectron2_al.
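To make the idea of perturbation-based object scoring and object-level selection more concrete, the following is a minimal Python sketch. It is not the paper's scoring function or the API of the linked repository. The perturbation scheme (small box shifts), the distance measures (1 - IoU for position, total variation for category), the equal weighting of the two terms, and every name here (`ambiguity_score`, `select_for_annotation`, the `refine` callable standing in for a detector head) are assumptions made purely for illustration.

```python
from typing import Callable, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Probs = Sequence[float]                  # class probabilities for one object

# Hypothetical perturbations: shift the box by 10% of its width or height.
OFFSETS = ((0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1))


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def shift(box: Box, dx: float, dy: float) -> Box:
    """Translate a box by a fraction of its own width and height."""
    w, h = box[2] - box[0], box[3] - box[1]
    return (box[0] + dx * w, box[1] + dy * h, box[2] + dx * w, box[3] + dy * h)


def ambiguity_score(
    box: Box,
    probs: Probs,
    refine: Callable[[Box], Tuple[Box, Probs]],
) -> float:
    """Average disagreement between a prediction and its perturbed re-predictions.

    `refine` is a hypothetical stand-in for the model: given a perturbed box,
    it returns the re-predicted box and class probabilities.
    """
    total = 0.0
    for dx, dy in OFFSETS:
        new_box, new_probs = refine(shift(box, dx, dy))
        position_term = 1.0 - iou(box, new_box)
        # Total variation distance between the two class distributions.
        category_term = 0.5 * sum(abs(p - q) for p, q in zip(probs, new_probs))
        total += position_term + category_term
    return total / len(OFFSETS)


def select_for_annotation(scores: Sequence[float], ratio: float) -> list:
    """Return indices of the most ambiguous objects, at least one per image."""
    k = max(1, int(ratio * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])
```

In this sketch, a prediction that the model reproduces unchanged under every perturbation scores 0, while one whose box or category drifts scores higher and is sent to the annotator first. The remaining predictions would be kept as model output, subject to whatever correction step is applied to them.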
