HC Apr 9
HealthTale: A Patient-Centric Health Story Visualization Tool
Ryan Smith, Kyle D. Chin, Tamara Munzner
Patients often struggle to communicate coherent accounts of their health histories during time-constrained clinical encounters. These accounts, which we refer to as health stories, include both clinical events and lived experiences. Existing systems prioritize structured, clinician-centered data and provide limited support for eliciting and communicating patient-generated narratives. We present HealthTale, a patient-centric visualization system designed to elicit health stories from patients and structure them to facilitate communication during initial clinical conversations. Its design arises from a multi-stage qualitative investigation across domain expert discussions, online narratives (n=20), patient (n=11) and clinician (n=6) interviews, and elicited health stories (n=22), identifying recurring patterns in how individuals construct and communicate their health stories. HealthTale transforms freeform narratives into structured timeline representations, grounded in a data abstraction that models health stories as events that are grouped by health concern and time, capturing both clinical and contextual information, with the flexibility to handle temporally imprecise data and non-linear distributions of events across time. Through evaluation with patients (n=34) and clinicians (n=3), we find that HealthTale supports recall, organization, and self-advocacy, while enabling clinicians to rapidly interpret patient-generated narratives and establish a shared understanding.
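The data abstraction described above (events grouped by health concern and time, with imprecise dates) can be illustrated with a minimal sketch. Every field name here, and the choice to represent imprecise time as a year range, is an assumption made for illustration; the abstract does not specify HealthTale's actual schema.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class HealthEvent:
    # All fields are hypothetical; the abstract does not define a schema.
    description: str
    concern: str        # health concern the event belongs to
    kind: str           # "clinical" or "contextual"
    earliest_year: int  # lower bound of when the event happened
    latest_year: int    # upper bound; equals earliest_year when the year is known


def group_by_concern(events):
    """Group events by health concern, ordering each group by its time bounds."""
    groups = defaultdict(list)
    for event in events:
        groups[event.concern].append(event)
    for group in groups.values():
        group.sort(key=lambda e: (e.earliest_year, e.latest_year))
    return dict(groups)


events = [
    HealthEvent("MRI of lower back", "back pain", "clinical", 2021, 2021),
    HealthEvent("Pain started after moving house", "back pain", "contextual", 2018, 2020),
    HealthEvent("Migraine diagnosis", "headaches", "clinical", 2015, 2015),
]
timeline = group_by_concern(events)
```

An event remembered only as "sometime between 2018 and 2020" keeps both bounds, so it can still be placed on a timeline without inventing a precise date.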
HC Apr 19, 2025
Visualization Tasks for Unlabelled Graphs
Matt I. B. Oddo, Ryan Smith, Stephen Kobourov et al.
We investigate tasks that can be accomplished with unlabelled graphs, which are graphs with nodes that do not have attached persistent or semantically meaningful labels. New visualization techniques to represent unlabelled graphs have been proposed, but more understanding of unlabelled graph tasks is required before these techniques can be adequately evaluated. Some tasks apply to both labelled and unlabelled graphs, but many do not translate between these contexts. We propose a data abstraction model that distinguishes the Unlabelled context from the increasingly semantically rich Labelled, Attributed, and Augmented contexts. We filter tasks collected and gleaned from the literature according to our data abstraction and analyze the surfaced tasks, leading to a taxonomy of abstract tasks for unlabelled graphs. Our task taxonomy is organized according to the Scope of the data at play, the Action intended by the user, and the Target data under consideration. We show the descriptive power of this task abstraction by connecting to concrete examples from previous frameworks, and connect these abstractions to real-world problems. To showcase the evaluative power of the taxonomy, we perform a preliminary assessment of 6 visualizations for each task. For each combination of task and visual encoding, we consider the effort required from viewers, the likelihood of task success, and how both factors vary between small-scale and large-scale graphs.
LG Jun 24, 2021
Visualizing Graph Neural Networks with CorGIE: Corresponding a Graph to Its Embedding
Zipeng Liu, Yang Wang, Jürgen Bernard et al.
Graph neural networks (GNNs) are a class of powerful machine learning tools that model node relations for making predictions of nodes or links. GNN developers rely on quantitative metrics of the predictions to evaluate a GNN, but similar to many other neural networks, it is difficult for them to understand if the GNN truly learns characteristics of a graph as expected. We propose an approach to corresponding an input graph to its node embedding (aka latent space), a common component of GNNs that is later used for prediction. We abstract the data and tasks, and develop an interactive multi-view interface called CorGIE to instantiate the abstraction. As the key function in CorGIE, we propose the K-hop graph layout to show topological neighbors in hops and their clustering structure. To evaluate the functionality and usability of CorGIE, we present how to use CorGIE in two usage scenarios, and conduct a case study with five GNN experts.
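The idea of showing topological neighbors in hops can be sketched as a breadth-first search that buckets nodes by hop distance from a focal node. This is only an illustration of the grouping step, under the assumption of an unweighted graph stored as an adjacency dict. It is not CorGIE's K-hop layout, whose positioning and clustering are not described in the abstract.

```python
from collections import deque


def neighbors_by_hop(adj, focal, k):
    """Return {hop: [nodes]} for hops 1..k from the focal node, via BFS."""
    dist = {focal: 0}
    queue = deque([focal])
    hops = {h: [] for h in range(1, k + 1)}
    while queue:
        node = queue.popleft()
        if dist[node] == k:
            continue  # do not expand beyond k hops
        for nb in adj.get(node, ()):
            if nb not in dist:
                dist[nb] = dist[node] + 1
                hops[dist[nb]].append(nb)
                queue.append(nb)
    return hops


# Hypothetical toy graph.
adj = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
two_hop = neighbors_by_hop(adj, "a", 2)
```

For GNNs the hop count matters because, in common message-passing architectures, the number of layers bounds how many hops of neighborhood can influence a node's embedding.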
DC Oct 26, 2020
Aggregate-Driven Trace Visualizations for Performance Debugging
Vaastav Anand, Matheus Stolet, Thomas Davidson et al.
Performance issues in cloud systems are hard to debug. Distributed tracing is a widely adopted approach that gives engineers visibility into cloud systems. Existing trace analysis approaches focus on debugging single-request correctness issues but not single-request performance issues. Diagnosing a performance issue in a given request requires comparing the performance of the offending request with the aggregate performance of typical requests. Effective and efficient debugging of such issues faces three challenges: (i) identifying the correct aggregate data for diagnosis; (ii) visualizing the aggregated data; and (iii) efficiently collecting, storing, and processing trace data. We present TraVista, a tool designed for debugging performance issues in a single trace that addresses these challenges. TraVista extends the popular single-trace Gantt chart visualization with three types of aggregate data (metric, temporal, and structure) to contextualize the performance of the offending trace across all traces.
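Comparing one request against the aggregate can be sketched as a percentile rank per span. This is a minimal illustration of the metric-style comparison only, assuming each trace is a dict from a span name to a duration in milliseconds. The function names and data layout are hypothetical and are not TraVista's API.

```python
from bisect import bisect_right


def percentile_rank(sorted_values, x):
    """Percentage of values in sorted_values that are <= x."""
    return 100.0 * bisect_right(sorted_values, x) / len(sorted_values)


def contextualize(trace, all_traces):
    """Rank each span duration of one trace against the same span in all traces."""
    result = {}
    for span, duration in trace.items():
        history = sorted(t[span] for t in all_traces if span in t)
        if history:
            result[span] = percentile_rank(history, duration)
    return result


# Hypothetical durations in milliseconds.
all_traces = [
    {"auth": 10, "db": 40},
    {"auth": 12, "db": 45},
    {"auth": 11, "db": 50},
    {"auth": 13, "db": 42},
]
offending = {"auth": 11, "db": 200}
ranks = contextualize(offending, all_traces)
```

Here the `auth` span sits in the middle of the distribution while the `db` span is slower than every typical request, which points at the span worth inspecting first.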
HC Oct 22, 2020
GEViTRec: Data Reconnaissance Through Recommendation Using a Domain-Specific Prevalence Visualization Design Space
Anamaria Crisan, Shannah Fisher, Jennifer L. Gardy et al.
Genomic Epidemiology (genEpi) is a branch of public health that uses many different data types including tabular, network, genomic, and geographic, to identify and contain outbreaks of deadly diseases. Due to the volume and variety of data, it is challenging for genEpi domain experts to conduct data reconnaissance; that is, have an overview of the data they have and make assessments toward its quality, completeness, and suitability. We present an algorithm for data reconnaissance through automatic visualization recommendation, GEViTRec. Our approach handles a broad variety of dataset types and automatically generates coordinated combinations of charts, in contrast to existing systems that primarily focus on singleton visual encodings of tabular datasets. We automatically detect linkages across multiple input datasets by analyzing non-numeric attribute fields, creating an entity graph within which we analyze and rank paths. For each high-ranking path, we specify chart combinations with spatial and color alignments between shared fields, using a gradual binding approach to transform initial partial specifications of singleton charts to complete specifications that are aligned and oriented consistently. A novel aspect of our approach is its combination of domain-agnostic elements with domain-specific information that is captured through a domain-specific visualization prevalence design space. Our implementation is applied to both synthetic data and real data from an Ebola outbreak. We compare GEViTRec's output to what previous visualization recommendation systems would generate, and to manually crafted visualizations used by practitioners. We conducted formative evaluations with ten genEpi experts to assess the relevance and interpretability of our results.
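Detecting linkages across datasets through non-numeric fields can be sketched as follows. This is an illustrative approximation, not GEViTRec's algorithm: the Jaccard overlap measure, the `min_overlap` threshold, and the dict-of-columns data layout are all assumptions, and the path ranking and chart specification steps are not shown.

```python
def is_non_numeric(values):
    """Treat a column as non-numeric when every value is a string."""
    return all(isinstance(v, str) for v in values)


def find_links(datasets, min_overlap=0.5):
    """Return (dataset_a, col_a, dataset_b, col_b, overlap) for columns sharing values.

    Overlap is the Jaccard index of the two columns' value sets (an assumption).
    """
    links = []
    names = sorted(datasets)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            for col_a, vals_a in datasets[a].items():
                for col_b, vals_b in datasets[b].items():
                    if not (is_non_numeric(vals_a) and is_non_numeric(vals_b)):
                        continue
                    set_a, set_b = set(vals_a), set(vals_b)
                    union = set_a | set_b
                    if not union:
                        continue
                    overlap = len(set_a & set_b) / len(union)
                    if overlap >= min_overlap:
                        links.append((a, col_a, b, col_b, overlap))
    return sorted(links, key=lambda link: -link[4])


# Hypothetical datasets, each a dict of column name to values.
datasets = {
    "cases": {"case_id": ["c1", "c2", "c3"], "age": [30, 41, 25]},
    "samples": {"patient": ["c1", "c2", "c4"], "site": ["north", "south", "north"]},
}
links = find_links(datasets)
```

Each returned link can be read as an edge between two dataset fields; collecting those edges yields a graph over which paths could then be ranked.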
HC Sep 4, 2020
Table Scraps: An Actionable Framework for Multi-Table Data Wrangling From An Artifact Study of Computational Journalism
Stephen Kasica, Charles Berret, Tamara Munzner
For the many journalists who use data and computation to report the news, data wrangling is an integral part of their work. Despite an abundance of literature on data wrangling in the context of enterprise data analysis, little is known about the specific operations, processes, and pain points journalists encounter while performing this tedious, time-consuming task. To better understand the needs of this user group, we conduct a technical observation study of 50 public repositories of data and analysis code authored by 33 professional journalists at 26 news organizations. We develop two detailed and cross-cutting taxonomies of data wrangling in computational journalism, for actions and for processes. We observe the extensive use of multiple tables, a notable gap in previous wrangling analyses. We develop a concise, actionable framework for general multi-table data wrangling that includes wrangling operations documented in our taxonomy that are without clear parallels in other work. This framework, the first to incorporate tables as first-class objects, will support future interactive wrangling tools for both computational journalism and general-purpose use. We assess the generative and descriptive power of our framework through discussion of its relationship to our set of taxonomies.
HC Sep 3, 2020
Data-First Visualization Design Studies
Michael Oppermann, Tamara Munzner
We introduce the notion of a data-first design study which is triggered by the acquisition of real-world data instead of specific stakeholder analysis questions. We propose an adaptation of the design study methodology framework to provide practical guidance and to aid transferability to other data-first design processes. We discuss opportunities and risks by reflecting on two of our own data-first design studies. We review 64 previous design studies and identify 16 of them as edge cases with characteristics that may indicate a data-first design process in action.
HC Aug 18, 2020
VizCommender: Computing Text-Based Similarity in Visualization Repositories for Content-Based Recommendations
Michael Oppermann, Robert Kincaid, Tamara Munzner
Cloud-based visualization services have made visual analytics accessible to a much wider audience than ever before. Systems such as Tableau have started to amass increasingly large repositories of analytical knowledge in the form of interactive visualization workbooks. When shared, these collections can form a visual analytic knowledge base. However, as the size of a collection increases, so does the difficulty in finding relevant information. Content-based recommendation (CBR) systems could help analysts in finding and managing workbooks relevant to their interests. Toward this goal, we focus on text-based content that is representative of the subject matter of visualizations rather than the visual encodings and style. We discuss the challenges associated with creating a CBR based on visualization specifications and explore more concretely how to implement the relevance measures required using Tableau workbook specifications as the source of content data. We also demonstrate what information can be extracted from these visualization specifications and how various natural language processing techniques can be used to compute similarity between workbooks as one way to measure relevance. We report on a crowd-sourced user study to determine if our similarity measure mimics human judgement. Finally, we choose latent Dirichlet allocation (LDA) as a specific model and instantiate it in a proof-of-concept recommender tool to demonstrate the basic function of our similarity measure.
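Text-based similarity between workbooks can be illustrated with a bag-of-words cosine similarity. This is a deliberately simpler stand-in for the LDA model the authors chose, and the example strings are hypothetical stand-ins for text extracted from workbook specifications.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the term-count vectors of two texts."""
    va, vb = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical text extracted from three workbooks.
query = "quarterly sales by region"
candidate_close = "sales by region and product"
candidate_far = "hospital admissions over time"
```

Ranking candidates by `cosine_similarity(query, candidate)` gives a basic content-based recommendation order; a topic model would additionally match workbooks that share subject matter without sharing exact words.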
HC Jul 13, 2020
LSQT: Low-Stretch Quasi-Trees for Bundling and Layout
Rebecca Vandenberg, Madison Elliott, Nicholas Harvey et al.
We introduce low-stretch trees to the visualization community with LSQT, our novel technique that uses quasi-trees for both layout and edge bundling. Our method offers strong computational speed and complexity guarantees by leveraging the convenient properties of low-stretch trees, which accurately reflect the topological structure of arbitrary graphs with superior fidelity compared to arbitrary spanning trees. Low-stretch quasi-trees also have provable sparseness guarantees, providing algorithmic support for aggressive de-cluttering of hairball graphs. LSQT does not rely on previously computed vertex positions and computes bundles based on topological structure before any geometric layout occurs. Edge bundles are computed efficiently and stored in an explicit data structure that supports sophisticated visual encoding and interaction techniques, including dynamic layout adjustment and interactive bundle querying. Our unoptimized implementation handles graphs of over 100,000 edges in eight seconds, providing substantially higher performance than previous approaches.
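The notion of stretch can be made concrete with a small sketch. For an unweighted graph, the stretch of an edge (u, v) with respect to a spanning tree is the length of the tree path between u and v, and a low-stretch tree keeps the average over all graph edges small. The code below only measures average stretch for a given spanning tree; it does not construct a low-stretch tree or a quasi-tree, and it is not the LSQT implementation.

```python
from collections import deque


def tree_distance(tree_adj, src, dst):
    """Number of tree edges on the path from src to dst, found by BFS."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return dist[node]
        for nb in tree_adj[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    raise ValueError("tree does not connect src and dst")


def average_stretch(graph_edges, tree_edges):
    """Average stretch of the graph's edges with respect to a spanning tree."""
    tree_adj = {}
    for u, v in tree_edges:
        tree_adj.setdefault(u, []).append(v)
        tree_adj.setdefault(v, []).append(u)
    total = sum(tree_distance(tree_adj, u, v) for u, v in graph_edges)
    return total / len(graph_edges)


# A 4-cycle and the spanning path that drops the edge (3, 0).
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
path_tree = [(0, 1), (1, 2), (2, 3)]
stretch = average_stretch(cycle, path_tree)
```

The three tree edges each have stretch 1, while the dropped edge (3, 0) has stretch 3 because the tree path between its endpoints runs the long way around, giving an average of 1.5.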
HC Oct 31, 2016
On Regulatory and Organizational Constraints in Visualization Design and Evaluation
Anamaria Crisan, Jennifer L. Gardy, Tamara Munzner
Problem-based visualization research provides explicit guidance toward identifying and designing for the needs of users, but absent is more concrete guidance toward factors external to a user's needs that also have implications for visualization design and evaluation. This lack of more explicit guidance can leave visualization researchers and practitioners vulnerable to unforeseen constraints beyond the user's needs that can affect the validity of evaluations, or even lead to the premature termination of a project. Here we explore two types of external constraints in depth, regulatory and organizational constraints, and describe how these constraints impact visualization design and evaluation. By borrowing from techniques in software development, project management, and visualization research we recommend strategies for identifying, mitigating, and evaluating these external constraints through a design study methodology. Finally, we present an application of those recommendations in a healthcare case study. We argue that by explicitly incorporating external constraints into visualization design and evaluation, researchers and practitioners can improve the utility and validity of their visualization solution and improve the likelihood of successful collaborations with industries where external constraints are more present.