CV, IR, ML | Apr 21, 2017

Scatteract: Automated extraction of data from scatter plots

arXiv:1704.06687v1 | 76 citations
Originality: Incremental advance
AI Analysis

This work addresses the need to reuse and analyze data that is locked inside visualizations, a need felt mainly by researchers and analysts. It is an incremental advance because it builds on prior work that extracts data from other chart types.

The authors tackled the problem of automatically extracting numerical data from scatter plot images, achieving successful extraction on 89% of plots in their test set.

Charts are an excellent way to convey patterns and trends in data, but they do not facilitate further modeling of the data or close inspection of individual data points. We present a fully automated system for extracting the numerical values of data points from images of scatter plots. We use deep learning techniques to identify the key components of the chart, and optical character recognition together with robust regression to map from pixels to the coordinate system of the chart. We focus on scatter plots with linear scales, which already have several interesting challenges. Previous work has done fully automatic extraction for other types of charts, but to our knowledge this is the first approach that is fully automatic for scatter plots. Our method performs well, achieving successful data extraction on 89% of the plots in our test set.
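The pixel-to-coordinate step described in the abstract can be illustrated with a minimal sketch. It assumes that tick marks have already been located and that OCR has produced a numeric label for each one. The abstract does not say which robust estimator the authors use, so this sketch swaps in the Theil-Sen estimator (median of pairwise slopes) as one possible choice. The function names `fit_axis` and `pixel_to_value` and all tick data are hypothetical.

```python
from itertools import combinations
from statistics import median


def fit_axis(pixels, values):
    """Theil-Sen fit of value = slope * pixel + intercept for one linear axis.

    Assumption: this estimator stands in for the paper's unspecified
    "robust regression"; it is not taken from the authors' code.
    """
    pairs = list(zip(pixels, values))
    # Slope between every pair of ticks; the median ignores a few bad labels.
    slopes = [
        (v2 - v1) / (p2 - p1)
        for (p1, v1), (p2, v2) in combinations(pairs, 2)
        if p2 != p1
    ]
    if not slopes:
        raise ValueError("need at least two ticks at distinct pixel positions")
    slope = median(slopes)
    # Median residual gives an intercept that is also robust to outliers.
    intercept = median(v - slope * p for p, v in pairs)
    return slope, intercept


def pixel_to_value(pixel, slope, intercept):
    """Map a pixel position to chart coordinates using the fitted axis."""
    return slope * pixel + intercept


# Hypothetical x-axis ticks: pixel columns and the values OCR read for them.
# The fourth label is deliberately misread (30 became 80) to imitate an OCR error.
tick_pixels = [100, 200, 300, 400, 500]
tick_values = [0.0, 10.0, 20.0, 80.0, 40.0]

slope, intercept = fit_axis(tick_pixels, tick_values)
x_value = pixel_to_value(250, slope, intercept)
```

With these made-up ticks the fit recovers a slope of about 0.1 and an intercept of about -10 despite the misread label, so a point detected at pixel column 250 maps to roughly 15 on the x-axis. An ordinary least-squares fit on the same ticks would be pulled toward the bad label, which is why a robust estimator suits OCR output.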

