LG · DL · IR · Jun 6, 2013

Table of Content detection using Machine Learning

arXiv:1306.4631v1 · 3 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for efficient digitization and structuring of multipage documents such as books, but appears incremental, as it builds on existing TOC detection efforts.

The paper tackles the problem of detecting Table of Contents (TOC) pages in multipage documents to improve navigation and information retrieval, introducing a new machine learning method with different features for this task.

Table of Contents (TOC) detection has drawn attention recently because it plays an important role in the digitization of multipage documents. Books are typically multipage documents, so detecting their TOC pages is necessary both for easy navigation and for faster retrieval of desired information from the document. TOC pages follow different layouts and present a document's contents (chapters, sections, subsections, etc.) in different ways. This paper introduces a new method to detect TOC pages using a machine learning technique with different features. The main aim of detecting TOC pages is to structure the document according to its contents.

