Anveshak - A Groundtruth Generation Tool for Foreground Regions of Document Images
The tool addresses the need for efficient groundtruth generation in document image analysis. Its contribution is incremental: it builds on existing annotation methods and narrows the focus to foreground regions.
The researchers address the problem of creating annotated groundtruth data for document image analysis with Anveshak, a graphical user interface tool in which users group foreground pixels and label the resulting groups. Its output is an image together with an XML metadata file.
In this paper we propose a groundtruth generation tool with a graphical user interface. An input document image is annotated at the level of its foreground pixels. Through user interaction, foreground pixels are grouped into labeling units, which the user then labels with user-defined labels. The tool outputs an image together with an XML file containing its metadata. The annotated data can then be used in various applications of document image analysis.
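
The workflow described above (group foreground pixels into labeling units, label each unit, write the metadata as XML) can be illustrated with a minimal sketch. Nothing here is taken from Anveshak itself. The use of 4-connected components as the atomic pixel groups, the hard-coded grouping that stands in for user interaction, and the XML element and attribute names (`groundtruth`, `unit`, `xmin`, and so on) are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import deque


def connected_components(binary):
    """Return 4-connected groups of foreground pixels as lists of (row, col)."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] and not seen[r][c]:
                # Breadth-first flood fill starting from an unvisited foreground pixel.
                queue = deque([(r, c)])
                seen[r][c] = True
                pixels = []
                while queue:
                    y, x = queue.popleft()
                    pixels.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                components.append(pixels)
    return components


def bounding_box(pixels):
    """Return (xmin, ymin, xmax, ymax) for a list of (row, col) pixels."""
    ys = [p[0] for p in pixels]
    xs = [p[1] for p in pixels]
    return min(xs), min(ys), max(xs), max(ys)


def build_metadata(image_name, units):
    """Serialize labeled units to XML; the schema is hypothetical, not Anveshak's."""
    root = ET.Element("groundtruth", image=image_name)
    for idx, (label, pixels) in enumerate(units):
        x0, y0, x1, y1 = bounding_box(pixels)
        ET.SubElement(root, "unit", id=str(idx), label=label,
                      xmin=str(x0), ymin=str(y0), xmax=str(x1), ymax=str(y1),
                      pixels=str(len(pixels)))
    return ET.tostring(root, encoding="unicode")


# Toy binarized page: 1 marks a foreground pixel, 0 marks background.
binary = [
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 1],
    [1, 0, 0, 0, 0],
]
components = connected_components(binary)

# Stand-in for user interaction: merge components 0 and 2 into one unit
# and give each unit a user-defined label (both label names are made up).
units = [
    ("text", components[0] + components[2]),
    ("graphic", components[1]),
]
xml_text = build_metadata("page.png", units)
print(xml_text)
```

In an interactive tool the grouping and labels would come from mouse selections rather than a fixed list, and the metadata would be written alongside the annotated image. The sketch stores only a bounding box and pixel count per unit, whereas a real groundtruth file may need to record the exact pixel membership.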