AI HC ML · Nov 27, 2024

ScaleViz: Scaling Visualization Recommendation Models on Large Data

Ghazi Shazan Ahmad, Shubham Agarwal, Subrata Mitra, Ryan Rossi, Manav Doshi, Vibhor Porwal, Syam Manoj Kumar Paila

arXiv:2411.18657v1 · 4.21 citations · h-index: 8 · PAKDD

Originality: Incremental advance

AI Analysis

This work addresses a bottleneck for users of automated visualization tools by enabling faster insights from large real-world datasets, though it is incremental as it builds on existing models.

The paper tackles the computational inefficiency of automated visualization recommendation models on large datasets by proposing a reinforcement-learning framework that selects the most effective input statistics within a user-specified time budget, resulting in a 10x speedup with minimal error.

Automated visualization recommendations (vis-rec) help users derive crucial insights from new datasets. Typically, such automated vis-rec models first calculate a large number of statistics from the datasets and then use machine-learning models to score or classify multiple visualization choices and recommend the most effective ones according to those statistics. However, state-of-the-art models rely on a very large number of expensive statistics, so using such models on large datasets becomes infeasible due to prohibitively long computation time, limiting the effectiveness of such techniques on most complex, large real-world datasets. In this paper, we propose a novel reinforcement-learning (RL) based framework that takes a given vis-rec model and a time budget from the user and identifies the set of input statistics that would be most effective for generating visual insights with the given model within that time budget. Using two state-of-the-art vis-rec models applied to three large real-world datasets, we show the effectiveness of our technique in significantly reducing time-to-visualize while introducing only a very small amount of error. Our approach is about 10X faster than baseline approaches that introduce similar amounts of error.
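To make the problem setup concrete, here is a minimal sketch of choosing input statistics under a time budget. It is not the paper's RL framework: it swaps in a simple greedy heuristic (importance per second) purely for illustration. The `Statistic` fields, the statistic names, and all cost and importance values are hypothetical assumptions, not figures from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statistic:
    name: str          # hypothetical statistic name
    cost: float        # assumed compute time in seconds (must be > 0)
    importance: float  # assumed usefulness to the vis-rec model


def select_within_budget(stats, budget):
    """Greedy stand-in (not the paper's RL agent): take statistics in
    order of importance per second, skipping any that would exceed the budget."""
    chosen, spent = [], 0.0
    ranked = sorted(stats, key=lambda s: s.importance / s.cost, reverse=True)
    for s in ranked:
        if spent + s.cost <= budget:
            chosen.append(s)
            spent += s.cost
    return chosen, spent


# Illustrative values only.
stats = [
    Statistic("mean", cost=1.0, importance=0.30),
    Statistic("entropy", cost=4.0, importance=0.40),
    Statistic("pairwise_correlation", cost=10.0, importance=0.50),
    Statistic("num_unique", cost=2.0, importance=0.35),
]
chosen, spent = select_within_budget(stats, budget=7.0)
print([s.name for s in chosen], spent)
```

With a 7-second budget, the sketch keeps the three cheap statistics and drops the expensive pairwise one. A learned policy, as the abstract describes, would instead pick the subset that minimizes the error the given vis-rec model incurs under that budget.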
