HC · AI · LG · Aug 14, 2018

VizML: A Machine Learning Approach to Visualization Recommendation

arXiv:1808.04819v1 · 254 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of lowering the barrier to data visualization for analysts without technical expertise, representing an incremental improvement over existing ML-based systems.

The authors make data visualization more accessible with a machine learning approach that learns visualization design choices from a large corpus of dataset-visualization pairs. Their models predict these choices with high accuracy, and on consensus visualization type they perform comparably to humans.

Data visualization should be accessible for all analysts with data, not just the few with technical expertise. Visualization recommender systems aim to lower the barrier to exploring basic visualizations by automatically generating results for analysts to search and select, rather than manually specify. Here, we demonstrate a novel machine learning-based approach to visualization recommendation that learns visualization design choices from a large corpus of datasets and associated visualizations. First, we identify five key design choices made by analysts while creating visualizations, such as selecting a visualization type and choosing to encode a column along the X- or Y-axis. We train models to predict these design choices using one million dataset-visualization pairs collected from a popular online visualization platform. Neural networks predict these design choices with high accuracy compared to baseline models. We report and interpret feature importances from one of these baseline models. To evaluate the generalizability and uncertainty of our approach, we benchmark with a crowdsourced test set, and show that the performance of our model is comparable to human performance when predicting consensus visualization type, and exceeds that of other ML-based systems.
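The pipeline described in the abstract (extract features from a dataset, then train a model to predict a design choice such as visualization type) can be illustrated with a minimal sketch. Everything below is an assumption for illustration: the two column features, the toy training pairs, and the helper names (`column_features`, `dataset_features`, `fit_centroids`, `predict`) are hypothetical, and a simple nearest-centroid classifier stands in for the neural networks the paper trains.

```python
from statistics import mean


def column_features(values):
    """Hypothetical per-column features: numeric flag and uniqueness ratio."""
    numeric = all(isinstance(v, (int, float)) for v in values)
    unique_ratio = len(set(values)) / len(values)
    return [1.0 if numeric else 0.0, unique_ratio]


def dataset_features(columns):
    """Average the column features into one fixed-length dataset vector."""
    per_column = [column_features(c) for c in columns]
    return [mean(f[i] for f in per_column) for i in range(len(per_column[0]))]


def fit_centroids(pairs):
    """Compute the mean feature vector for each visualization type.

    pairs: list of (columns, vis_type) dataset-visualization pairs.
    """
    grouped = {}
    for columns, vis_type in pairs:
        grouped.setdefault(vis_type, []).append(dataset_features(columns))
    return {t: [mean(dim) for dim in zip(*vecs)] for t, vecs in grouped.items()}


def predict(centroids, columns):
    """Return the visualization type whose centroid is closest to the dataset."""
    x = dataset_features(columns)

    def squared_distance(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))

    return min(centroids, key=lambda t: squared_distance(centroids[t]))


# Toy dataset-visualization pairs (made up for this sketch, not from the paper).
training_pairs = [
    ([[1, 2, 3, 4], [2.0, 4.1, 6.2, 8.3]], "scatter"),
    ([[5, 6, 7, 8], [1.5, 1.7, 2.9, 3.1]], "scatter"),
    ([["a", "b", "c", "d"], [10, 20, 30, 40]], "bar"),
    ([["x", "y", "z", "w"], [3, 3, 5, 7]], "bar"),
]

centroids = fit_centroids(training_pairs)
print(predict(centroids, [["q", "r", "s"], [1, 2, 3]]))  # categorical + numeric
print(predict(centroids, [[1, 2, 3], [9, 8, 7]]))        # two numeric columns
```

With these toy pairs, a dataset with one categorical and one numeric column lands nearest the "bar" centroid, and a dataset with two numeric columns lands nearest "scatter". A real system would use a far richer feature set and a held-out test set, as the abstract describes.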

Code Implementations: 1 repo