CV · LG · Nov 30, 2021

MapReader: A Computer Vision Pipeline for the Semantic Exploration of Maps at Scale

arXiv:2111.15592v1 · 24 citations · Has Code
Originality: Synthesis-oriented
AI Analysis

MapReader addresses a challenge historians face: efficiently extracting and analyzing semantic information from large map collections. It represents an incremental improvement in domain-specific computer vision applications.

The authors address the analysis of large map collections with MapReader, a free, open-source Python library that turns maps into searchable primary sources. They use it to process and interpret approximately 16,000 nineteenth-century Ordnance Survey map sheets, divided into about 30.5 million patches.

We present MapReader, a free, open-source software library written in Python for analyzing large map collections (scanned or born-digital). This library transforms the way historians can use maps by turning extensive, homogeneous map sets into searchable primary sources. MapReader allows users with little or no computer vision expertise to i) retrieve maps via web-servers; ii) preprocess and divide them into patches; iii) annotate patches; iv) train, fine-tune, and evaluate deep neural network models; and v) create structured data about map content. We demonstrate how MapReader enables historians to interpret a collection of $\approx$16K nineteenth-century Ordnance Survey map sheets ($\approx$30.5M patches), foregrounding the challenge of translating visual markers into machine-readable data. We present a case study focusing on British rail infrastructure and buildings as depicted on these maps. We also show how the outputs from the MapReader pipeline can be linked to other, external datasets, which we use to evaluate as well as enrich and interpret the results. We release $\approx$62K manually annotated patches used here for training and evaluating the models.
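Step ii of the pipeline above (dividing maps into patches) can be illustrated with a minimal sketch. This is not MapReader's API: the function name `patch_boxes`, the pixel dimensions, and the patch size are hypothetical, chosen only to show how a sheet might be tiled into a grid of non-overlapping patches.

```python
def patch_boxes(width, height, patch_size):
    """Return (left, top, right, bottom) pixel boxes that tile an image.

    Hypothetical helper for illustration; not part of MapReader.
    Boxes on the right and bottom edges are clipped to the image bounds,
    so they are smaller when the dimensions are not multiples of patch_size.
    """
    if patch_size <= 0:
        raise ValueError("patch_size must be positive")
    boxes = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            right = min(left + patch_size, width)
            bottom = min(top + patch_size, height)
            boxes.append((left, top, right, bottom))
    return boxes


# Assumed dimensions for a stand-in sheet: 400 x 250 pixels, 100-pixel patches.
boxes = patch_boxes(width=400, height=250, patch_size=100)
print(len(boxes))   # 4 columns x 3 rows = 12 boxes
print(boxes[-1])    # (300, 200, 400, 250): a clipped bottom-row box
```

Each box could then be passed to an image library's crop routine to produce the patch itself. For scale, the approximate figures in the abstract (about 30.5M patches from about 16K sheets) imply on the order of 1,900 patches per sheet on average.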

Code Implementations: 1 repo
