CV · Aug 28, 2023

Ensemble of Anchor-Free Models for Robust Bangla Document Layout Segmentation

arXiv:2308.14397v2 · 1 citation · h-index: 1
Originality: Synthesis-oriented
AI Analysis

This work addresses layout segmentation for Bangla documents and offers an incremental improvement on a domain-specific task.

The paper tackles Bangla document layout segmentation with an ensemble of YOLOv8 models, combining image augmentation with Bayesian optimization of the ensemble's confidence and IoU thresholds. Deliberately reducing the quality of a subset of training images made training more robust and improved the cross-validation score.

We introduce a novel approach to segmenting the layout of Bangla documents. Our method uses an ensemble of YOLOv8 models trained for the DL Sprint 2.0 - BUET CSE Fest 2023 Competition on Bangla document layout segmentation. We focus on improving several aspects of the task: image augmentation, model architecture, and model ensembling. We deliberately reduce the quality of a subset of document images to make model training more resilient, which improves our cross-validation score. Using Bayesian optimization, we determine the optimal confidence and Intersection over Union (IoU) thresholds for our model ensemble. Our approach demonstrates the effectiveness of anchor-free models for robust layout segmentation of Bangla documents.
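
To make the role of the confidence and IoU thresholds mentioned in the abstract concrete, here is a minimal sketch of how two such thresholds typically act on a pool of detections: low-confidence predictions are dropped, then overlapping predictions are suppressed greedily. This is not the authors' code. The names `iou` and `filter_detections` are hypothetical, plain bounding boxes stand in for segmentation masks, and the greedy suppression shown is the generic non-maximum suppression scheme, assumed here for illustration only.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), illustrative format
Detection = Tuple[Box, float]            # (box, confidence score)


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(
    dets: List[Detection], conf_thres: float, iou_thres: float
) -> List[Detection]:
    """Hypothetical helper: confidence filter followed by greedy suppression."""
    # Step 1: discard predictions scoring below the confidence threshold,
    # then visit the survivors from most to least confident.
    candidates = sorted(
        (d for d in dets if d[1] >= conf_thres),
        key=lambda d: d[1],
        reverse=True,
    )
    # Step 2: keep a prediction only if it does not overlap an already
    # kept one by more than the IoU threshold.
    kept: List[Detection] = []
    for box, score in candidates:
        if all(iou(box, k[0]) <= iou_thres for k in kept):
            kept.append((box, score))
    return kept


if __name__ == "__main__":
    pooled = [
        ((0, 0, 10, 10), 0.9),
        ((1, 1, 11, 11), 0.8),    # overlaps the first box heavily
        ((20, 20, 30, 30), 0.3),  # separate region, low confidence
    ]
    for conf_thres, iou_thres in [(0.25, 0.5), (0.25, 0.7), (0.5, 0.5)]:
        n = len(filter_detections(pooled, conf_thres, iou_thres))
        print(conf_thres, iou_thres, n)
```

Because the number of surviving detections changes with both thresholds, the pair can be treated as a two-dimensional search space; a Bayesian optimizer would propose `(conf_thres, iou_thres)` values and score each against a validation metric.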

