Progressive Analytics: A Computation Paradigm for Exploratory Data Analysis
The work addresses analysts' need for faster data exploration by reducing latency, though its contribution is incremental because it builds on existing progressive computation ideas.
The paper tackles slow feedback loops in exploratory data analysis over large or complex data. It introduces Progressive Analytics, a computation paradigm that guarantees low latency by performing computation progressively at the programming language level, and demonstrates it with a prototype called ProgressiVis.
Exploring data requires a fast feedback loop from the analyst to the system, with a latency below about 10 seconds because of human cognitive limitations. When data becomes large or analysis becomes complex, sequential computations can no longer be completed in a few seconds and data exploration is severely hampered. This article describes a novel computation paradigm called Progressive Computation for Data Analysis, or more concisely Progressive Analytics, that brings a low-latency guarantee to the programming language level by performing computations in a progressive fashion. Moving this progressive computation to the language level relieves programmers of exploratory data analysis systems from implementing the whole analytics pipeline in a progressive way from scratch, streamlining the implementation of scalable exploratory data analysis systems. The article describes the new paradigm through a prototype implementation called ProgressiVis and explains the requirements it implies through examples.
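To make the idea of computing "in a progressive fashion" concrete, the following is a minimal sketch of a running mean that returns a partial result after each bounded time slice instead of blocking until the whole dataset is processed. It is an illustration only and does not use or reproduce the ProgressiVis API; the names `progressive_mean`, `quantum_s`, and `chunk_size` are hypothetical and chosen for this example.

```python
import time
from typing import Iterator, List, Tuple


def progressive_mean(
    values: List[float], quantum_s: float = 0.01, chunk_size: int = 1000
) -> Iterator[Tuple[float, float]]:
    """Yield (fraction_processed, running_mean) roughly once per time quantum.

    Hypothetical helper for illustration; not part of ProgressiVis.
    """
    total = 0.0
    count = 0
    n = len(values)
    while count < n:
        # Each step gets a time budget (the quantum) that bounds its latency.
        deadline = time.monotonic() + quantum_s
        while True:
            # Always process at least one chunk so every step makes progress.
            chunk = values[count:count + chunk_size]
            total += sum(chunk)
            count += len(chunk)
            if count >= n or time.monotonic() >= deadline:
                break
        # Hand a partial, improving estimate back to the caller.
        yield count / n, total / count
```

A caller can iterate over the generator and refresh a display after each yielded estimate, stopping early once the estimate is good enough. The quantum bounds the delay between updates only approximately, since a step always finishes the chunk it has started; a smaller `chunk_size` tightens that bound at the cost of more overhead.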