Beagle: Automated Extraction and Interpretation of Visualizations from the Web
This work addresses the need for automated analysis of web visualizations to inform visualization design and tool development. Its contribution is incremental in that it builds on existing extraction and classification methods.
To understand how prevalent visualizations are on the web and which types dominate, the authors developed Beagle, a tool that automatically extracts SVG-based visualizations and classifies them by type. Beagle achieves 86% classification accuracy across 24 visualization types. An analysis of over 41,000 extracted visualizations shows that bar charts, line charts, scatter charts, and geographic maps are the most common types, while pie charts are relatively rare.
"How common is interactive visualization on the web?" "What is the most popular visualization design?" "How prevalent are pie charts really?" These questions intimate the role of interactive visualization in the real (online) world. In this paper, we present our approach to answering these questions, along with our findings. First, we introduce Beagle, which mines the web for SVG-based visualizations and automatically classifies them by type (e.g., bar, pie). With Beagle, we extract over 41,000 visualizations across five different tools and repositories, and classify them with 86% accuracy across 24 visualization types. Given this visualization collection, we study usage across tools. We find that most visualizations fall under four types: bar charts, line charts, scatter charts, and geographic maps. Though controversial, pie charts are relatively rare in practice. Our findings also indicate that users may prefer tools that emphasize a succinct set of visualization types and provide diverse expert visualization examples.
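As a rough illustration of the kind of input a type classifier for SVG-based visualizations might work from, the sketch below counts the elements of an SVG document by tag name. It is not Beagle's implementation: the function name `svg_element_counts`, the example markup, and the choice of tag counts as a feature are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def svg_element_counts(svg_text: str) -> Counter:
    """Count SVG elements by tag name, ignoring XML namespaces.

    Hypothetical helper: tag counts are an assumed, simplified feature,
    not the feature set described by the authors.
    """
    root = ET.fromstring(svg_text)
    counts = Counter()
    for element in root.iter():
        # Namespaced tags look like '{http://www.w3.org/2000/svg}rect';
        # keep only the local name, e.g. 'rect'.
        tag = element.tag.rsplit("}", 1)[-1]
        counts[tag] += 1
    return counts


# Hypothetical example: a tiny bar-chart-like SVG with three bars and an axis.
EXAMPLE_SVG = """<svg xmlns="http://www.w3.org/2000/svg" width="120" height="60">
  <g>
    <rect x="0" y="20" width="20" height="40"/>
    <rect x="30" y="10" width="20" height="50"/>
    <rect x="60" y="30" width="20" height="30"/>
  </g>
  <path d="M0 60 L120 60"/>
</svg>"""

counts = svg_element_counts(EXAMPLE_SVG)
print(dict(counts))
```

A document dominated by `rect` elements is at best a hint of a bar chart; a real classifier would need richer features and labeled training data.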