CV · LG · IV · Jul 2, 2019

Semi-Bagging Based Deep Neural Architecture to Extract Text from High Entropy Images

arXiv:1907.01284v1
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of text detection in cluttered images for applications like e-commerce and augmented reality, representing an incremental improvement over prior CNN-based methods.

The paper tackles the problem of extracting text from images with high entropy and multiple objects by proposing an end-to-end strategy combining segmentation and an ensemble of text detectors, which outperforms existing methods in coverage and text extraction from such images.

Extracting text of various sizes and shapes from images containing multiple objects is an important problem in many contexts, especially in e-commerce and in augmented reality assistance systems for natural scenes. Existing approaches (based only on CNNs) often perform sub-optimally when the image contains high-entropy regions with multiple objects. This paper presents an end-to-end text detection strategy that combines a segmentation algorithm with an ensemble of multiple text detectors of different types, detecting text in each image segment independently. The proposed strategy involves a super-pixel based image segmenter which splits an image into multiple regions. A convolutional deep neural architecture is developed which works on each of the segments and detects text of multiple shapes, sizes, and structures. It outperforms the competing methods in terms of coverage when detecting text in images, especially those where text of various types and sizes is packed into a small region along with various other objects. Furthermore, the proposed text detection method, combined with a text recognizer, outperforms the existing state-of-the-art approaches in extracting text from high-entropy images. We validate the results on a dataset consisting of product images from an e-commerce website.
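The segment-then-ensemble idea described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: a plain grid split stands in for the super-pixel segmenter, detectors are hypothetical callables that take a segment and return boxes in image coordinates, and the ensemble is combined by a simple union with IoU-based de-duplication. The names `grid_segments`, `iou`, and `detect_text` are likewise hypothetical.

```python
from typing import Callable, List, Tuple

# (x0, y0, x1, y1) in pixel coordinates, with x1 and y1 exclusive.
Box = Tuple[int, int, int, int]
# Hypothetical detector interface: takes one segment, returns text boxes
# in full-image coordinates.
Detector = Callable[[Box], List[Box]]


def grid_segments(width: int, height: int, rows: int, cols: int) -> List[Box]:
    """Stand-in for a super-pixel segmenter: split the image into a grid."""
    segments = []
    for r in range(rows):
        for c in range(cols):
            x0, x1 = c * width // cols, (c + 1) * width // cols
            y0, y1 = r * height // rows, (r + 1) * height // rows
            segments.append((x0, y0, x1, y1))
    return segments


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0


def detect_text(
    segments: List[Box],
    detectors: List[Detector],
    iou_threshold: float = 0.5,
) -> List[Box]:
    """Run every detector on every segment independently and pool the boxes.

    A box is kept only if it does not overlap an already-kept box by
    `iou_threshold` or more (an assumed, simplistic merging rule).
    """
    kept: List[Box] = []
    for seg in segments:
        for detect in detectors:
            for box in detect(seg):
                if all(iou(box, k) < iou_threshold for k in kept):
                    kept.append(box)
    return kept
```

In this sketch, coverage grows with each detector added to the ensemble, since any box that only one detector finds is still kept, while boxes found by several detectors are collapsed into one.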

