Analytic Provenance Datasets: A Data Repository of Human Analysis Activity and Interaction Logs
This repository gives researchers in data analysis and human-computer interaction a resource for studying exploratory analysis processes, though the contribution is incremental because it focuses on data collection rather than new methods.
The authors address the problem of studying human analysis activity and software interaction during exploratory data analysis by creating a repository of analytic provenance datasets. The repository includes interaction logs, think-alouds, and videos from user studies with textual and cyber security data. It supports research on tools and techniques for analyzing interaction logs, and it allows algorithmic methods to be compared against ground-truth records.
We present an analytic provenance data repository that can be used to study human analysis activity, thought processes, and software interaction with visual analysis tools during exploratory data analysis. We conducted a series of user studies involving exploratory data analysis scenarios with textual and cyber security data. Interaction logs, think-alouds, videos, and all coded data from these studies are available online for research purposes. Analysis sessions are segmented into multiple sub-task steps based on the user think-alouds, video, and audio captured during the studies. These analytic provenance datasets can be used for research involving tools and techniques for analyzing interaction logs and analysis history. Because high-quality coded data is provided along with the interaction logs, it is possible to compare algorithmic data processing techniques to the ground-truth records of analysis history.
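As an illustration of the comparison described above, the following minimal sketch scores segment boundaries proposed by an algorithm against manually coded sub-task boundaries. It is not the repository's tooling and does not assume the repository's file format: the function names, the representation of boundaries as timestamps in seconds, the tolerance window, and the greedy matching rule are all hypothetical choices made for this example.

```python
from typing import List, Tuple


def match_boundaries(predicted: List[float], truth: List[float],
                     tolerance: float) -> int:
    """Count predicted boundaries that lie within `tolerance` seconds of a
    distinct ground-truth boundary (greedy one-to-one matching).

    Hypothetical helper: boundaries are assumed to be timestamps in seconds.
    """
    unmatched = sorted(truth)
    hits = 0
    for p in sorted(predicted):
        # Pick the closest ground-truth boundary that is still unmatched.
        best = min(unmatched, key=lambda t: abs(t - p), default=None)
        if best is not None and abs(best - p) <= tolerance:
            unmatched.remove(best)  # each true boundary is matched at most once
            hits += 1
    return hits


def boundary_scores(predicted: List[float], truth: List[float],
                    tolerance: float = 5.0) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) of predicted vs. ground-truth boundaries."""
    hits = match_boundaries(predicted, truth, tolerance)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(truth) if truth else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Made-up timestamps for illustration only; not taken from the datasets.
    coded_boundaries = [60.0, 150.0, 300.0]
    algorithm_boundaries = [58.0, 155.0, 220.0, 400.0]
    print(boundary_scores(algorithm_boundaries, coded_boundaries, tolerance=5.0))
```

Greedy matching is used here only for brevity. An optimal one-to-one assignment, or a different tolerance, may be more appropriate depending on how finely the sessions are coded.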