CV · Nov 3, 2017

Real-Time Document Image Classification using Deep CNN and Extreme Learning Machines

arXiv:1711.05862v1 · 80 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for real-time, efficient document classification in production environments, offering a novel hybrid method that significantly improves speed and accuracy.

The paper tackles the slow training and testing times of deep learning-based document image classification with a two-stage approach: a deep CNN serves as the feature extractor, and Extreme Learning Machines (ELMs) perform the classification. The method reaches 83.24% accuracy on the Tobacco-3482 dataset, with an ELM training time of 1.176 seconds.

This paper presents an approach to real-time training and testing for document image classification. In production environments, it is crucial that training be both accurate and time-efficient. Existing deep learning approaches for classifying documents do not meet these requirements, as they require much time for training and fine-tuning the deep architectures. Motivated by work in computer vision, we propose a two-stage approach. The first stage trains a deep network that works as a feature extractor, and in the second stage, Extreme Learning Machines (ELMs) are used for classification. The proposed approach outperforms all previously reported structural and deep learning-based methods with a final accuracy of 83.24% on the Tobacco-3482 dataset, a relative error reduction of 25% compared to a previous Convolutional Neural Network (CNN) based approach (DeepDocClassifier). More importantly, the training time of the ELM is only 1.176 seconds, and the overall prediction time for 2,482 images is 3.066 seconds. As such, this novel approach makes deep learning-based document classification suitable for large-scale real-time applications.
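
The second stage described above can be illustrated with a minimal sketch of an ELM classifier, which may help show why its training is fast: the hidden-layer weights are drawn at random and never updated, so fitting reduces to a single regularized least-squares solve for the output weights. This is not the paper's implementation. The class name `SimpleELM`, the hidden size, the sigmoid activation, the ridge term `reg`, and the synthetic Gaussian "features" standing in for CNN activations are all assumptions made for illustration.

```python
import time
import numpy as np


class SimpleELM:
    """Illustrative single-hidden-layer ELM (hypothetical, not the paper's code)."""

    def __init__(self, n_hidden=128, reg=1e-2, seed=0):
        self.n_hidden = n_hidden          # assumed hidden-layer width
        self.reg = reg                    # assumed ridge regularization strength
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Fixed random projection followed by a sigmoid nonlinearity.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        n_features = X.shape[1]
        self.classes_ = np.unique(y)
        # Random input weights and biases; these are never trained.
        self.W = self.rng.normal(size=(n_features, self.n_hidden)) / np.sqrt(n_features)
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        # One-hot targets, one column per class.
        T = (y[:, None] == self.classes_[None, :]).astype(float)
        # Output weights from one regularized least-squares solve.
        A = H.T @ H + self.reg * np.eye(self.n_hidden)
        self.beta = np.linalg.solve(A, H.T @ T)
        return self

    def predict(self, X):
        scores = self._hidden(X) @ self.beta
        return self.classes_[np.argmax(scores, axis=1)]


def make_fake_features(n_per_class, centers, rng):
    # Synthetic stand-in for CNN feature vectors: one Gaussian blob per class.
    X = np.vstack([c + rng.normal(size=(n_per_class, c.size)) for c in centers])
    y = np.repeat(np.arange(len(centers)), n_per_class)
    return X, y


def run_demo(seed=0):
    rng = np.random.default_rng(seed)
    centers = 2.0 * rng.normal(size=(3, 64))   # 3 classes, 64-dim features (assumed)
    X_train, y_train = make_fake_features(100, centers, rng)
    X_test, y_test = make_fake_features(50, centers, rng)

    start = time.perf_counter()
    elm = SimpleELM().fit(X_train, y_train)
    fit_seconds = time.perf_counter() - start

    accuracy = float(np.mean(elm.predict(X_test) == y_test))
    return accuracy, fit_seconds


if __name__ == "__main__":
    acc, secs = run_demo()
    print(f"test accuracy: {acc:.3f}, ELM fit time: {secs:.4f} s")
```

In a real pipeline the synthetic blobs would be replaced by feature vectors taken from a trained CNN, and the hidden size and regularization would need tuning on held-out data. The timing printed here reflects only this toy setup and is not comparable to the figures reported in the abstract.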

