HC Aug 12, 2022
Scholastic: Graphical Human-AI Collaboration for Inductive and Interpretive Text Analysis
Matt-Heun Hong, Lauren A. Marsh, Jessica L. Feuston et al.
Interpretive scholars generate knowledge from text corpora by manually sampling documents, applying codes, and refining and collating codes into categories until meaningful themes emerge. Given a large corpus, machine learning could help scale this data sampling and analysis, but prior research shows that experts are generally concerned about algorithms potentially disrupting or driving interpretive scholarship. We take a human-centered design approach to addressing concerns around machine-assisted interpretive research to build Scholastic, which incorporates a machine-in-the-loop clustering algorithm to scaffold interpretive text analysis. As a scholar applies codes to documents and refines them, the resulting coding schema serves as structured metadata which constrains hierarchical document and word clusters inferred from the corpus. Interactive visualizations of these clusters can help scholars strategically sample documents further toward insights. Scholastic demonstrates how human-centered algorithm design and visualizations employing familiar metaphors can support inductive and interpretive research methodologies through interactive topic modeling and document clustering.
HC Aug 1, 2023
CLAMS: A Cluster Ambiguity Measure for Estimating Perceptual Variability in Visual Clustering
Hyeon Jeon, Ghulam Jilani Quadri, Hyunwook Lee et al.
Visual clustering is a common perceptual task in scatterplots that supports diverse analytics tasks (e.g., cluster identification). However, even with the same scatterplot, the ways of perceiving clusters (i.e., conducting visual clustering) can differ due to the differences among individuals and ambiguous cluster boundaries. Although such perceptual variability casts doubt on the reliability of data analysis based on visual clustering, we lack a systematic way to efficiently assess this variability. In this research, we study perceptual variability in conducting visual clustering, which we call Cluster Ambiguity. To this end, we introduce CLAMS, a data-driven visual quality measure for automatically predicting cluster ambiguity in monochrome scatterplots. We first conduct a qualitative study to identify key factors that affect the visual separation of clusters (e.g., proximity or size difference between clusters). Based on study findings, we deploy a regression module that estimates the human-judged separability of two clusters. Then, CLAMS predicts cluster ambiguity by analyzing the aggregated results of all pairwise separability between clusters that are generated by the module. CLAMS outperforms widely-used clustering techniques in predicting ground truth cluster ambiguity. Meanwhile, CLAMS exhibits performance on par with human annotators. We conclude our work by presenting two applications for optimizing and benchmarking data mining techniques using CLAMS. The interactive demo of CLAMS is available at clusterambiguity.dev.
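As a rough illustration of the aggregation step described above, the sketch below turns pairwise separability scores in [0, 1] into a single ambiguity score. The entropy-based aggregation, the function names, and the example scores are assumptions made for illustration; the abstract does not specify how CLAMS combines the pairwise results.

```python
import math


def binary_entropy(p: float) -> float:
    """Shannon entropy (in bits) of a yes/no judgment with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a certain judgment carries no ambiguity
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def cluster_ambiguity(pairwise_separability: dict) -> float:
    """Average the per-pair ambiguity over all cluster pairs.

    ASSUMPTION: a pair is treated as most ambiguous when its estimated
    separability is 0.5, and the pairs are averaged. This is a hypothetical
    aggregation rule, not the one used by CLAMS.
    """
    scores = list(pairwise_separability.values())
    if not scores:
        return 0.0
    return sum(binary_entropy(s) for s in scores) / len(scores)


# Hypothetical separability estimates for three cluster pairs.
example = {("A", "B"): 0.95, ("A", "C"): 0.5, ("B", "C"): 0.8}
print(round(cluster_ambiguity(example), 3))
```

Under this assumed rule, a scatterplot whose cluster pairs are all clearly separated or clearly merged scores near 0, while pairs that viewers would split about evenly push the score toward 1.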
HC Jul 28, 2024
A Qualitative Analysis of Common Practices in Annotations: A Taxonomy and Design Space
Md Dilshadur Rahman, Ghulam Jilani Quadri, Bhavana Doppalapudi et al.
Annotations play a vital role in highlighting critical aspects of visualizations, aiding in data externalization and exploration, collaborative sensemaking, and visual storytelling. However, despite their widespread use, we identified a lack of a design space for common practices for annotations. In this paper, we evaluated over 1,800 static annotated charts to understand how people annotate visualizations in practice. Through qualitative coding of these diverse real-world annotated charts, we explored three primary aspects of annotation usage patterns: analytic purposes for chart annotations (e.g., present, identify, summarize, or compare data features), mechanisms for chart annotations (e.g., types and combinations of annotations used, frequency of different annotation types across chart types, etc.), and the data source used to generate the annotations. We then synthesized our findings into a design space of annotations, highlighting key design choices for chart annotations. We presented three case studies illustrating our design space as a practical framework for chart annotations to enhance the communication of visualization insights. All supplemental materials are available at https://shorturl.at/bAGM1.
HC Feb 6
Redundant is Not Redundant: Automating Efficient Categorical Palette Design Unifying Color & Shape Encodings with CatPAW
Chin Tseng, Arran Zeyu Wang, Ghulam Jilani Quadri et al.
Colors and shapes are commonly used to encode categories in multi-class scatterplots. Designers often combine the two channels to create redundant encodings, aiming to enhance class distinctions. However, evidence for the effectiveness of redundancy remains conflicted, and guidelines for constructing effective combinations are limited. This paper presents four crowdsourced experiments evaluating redundant color-shape encodings and identifying high-performing configurations across different category numbers. Results show that redundancy significantly improves accuracy in assessing class-level correlations, with the strongest benefits for 5-8 categories. We also find pronounced interaction effects between colors and shapes, underscoring the need for careful pairing in designing redundant encodings. Drawing on these findings, we introduce a categorical palette design tool that enables designers to construct empirically grounded palettes for effective categorical visualization. Our work advances understanding of categorical perception in data visualization by systematically identifying effective redundant color-shape combinations and embedding these insights into a practical palette design tool.
HC Feb 25, 2024
Cieran: Designing Sequential Colormaps via In-Situ Active Preference Learning
Matt-Heun Hong, Zachary N. Sunberg, Danielle Albers Szafir
Quality colormaps can help communicate important data patterns. However, finding an aesthetically pleasing colormap that looks "just right" for a given scenario requires significant design and technical expertise. We introduce Cieran, a tool that allows any data analyst to rapidly find quality colormaps while designing charts within Jupyter Notebooks. Our system employs an active preference learning paradigm to rank expert-designed colormaps and create new ones from pairwise comparisons, allowing analysts who are novices in color design to tailor colormaps to their data context. We accomplish this by treating colormap design as a path planning problem through the CIELAB colorspace with a context-specific reward model. In an evaluation with twelve scientists, we found that Cieran effectively modeled user preferences to rank colormaps and leveraged this model to create new quality designs. Our work shows the potential of active preference learning for supporting efficient visualization design optimization.
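To make the idea of learning a ranking from pairwise comparisons concrete, here is a minimal sketch of a Bradley-Terry style preference model with a linear reward, updated by gradient ascent on the log-likelihood of each comparison. This is a generic technique swapped in for illustration, not Cieran's implementation; the feature vectors, colormap names, and learning rate are all hypothetical.

```python
import math


def reward(weights, features):
    """Linear reward: dot product of weights and a colormap's feature vector."""
    return sum(w * f for w, f in zip(weights, features))


def preference_probability(weights, a, b):
    """Modeled probability that colormap a is preferred over colormap b."""
    return 1.0 / (1.0 + math.exp(-(reward(weights, a) - reward(weights, b))))


def update(weights, preferred, rejected, learning_rate=0.5):
    """One gradient-ascent step on the log-likelihood of a single comparison."""
    p = preference_probability(weights, preferred, rejected)
    return [
        w + learning_rate * (1.0 - p) * (fp - fr)
        for w, fp, fr in zip(weights, preferred, rejected)
    ]


# Hypothetical two-dimensional features for three candidate colormaps.
colormaps = {"warm": [1.0, 0.0], "cool": [0.0, 1.0], "mixed": [0.5, 0.5]}

weights = [0.0, 0.0]
# Simulated answers: the analyst picks "warm" over "cool" three times.
for _ in range(3):
    weights = update(weights, colormaps["warm"], colormaps["cool"])

# Rank candidates by learned reward, highest first.
ranking = sorted(colormaps, key=lambda name: reward(weights, colormaps[name]), reverse=True)
print(ranking)
```

An active learner would additionally choose which pair to show next, for example the pair whose modeled preference probability is closest to 0.5; that selection step is left out here.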
HC Feb 21, 2022
Making Data Tangible: A Cross-disciplinary Design Space for Data Physicalization
S. Sandra Bae, Clement Zheng, Mary Etta West et al.
Designing a data physicalization requires a myriad of different considerations. Despite the cross-disciplinary nature of these considerations, research currently lacks a synthesis across the different communities data physicalization sits upon, including their approaches, theories, and even terminologies. To bridge these communities synergistically, we present a design space that describes and analyzes physicalizations according to three facets: context (end-user considerations), structure (the physical structure of the artifact), and interactions (interactions with both the artifact and data). We construct this design space through a systematic review of 47 physicalizations and analyze the interrelationships of key factors when designing a physicalization. This design space cross-pollinates knowledge from relevant HCI communities, providing a cohesive overview of what designers should consider when creating a data physicalization while suggesting new design possibilities. We analyze the design decisions present in current physicalizations, discuss emerging trends, and identify underlying open challenges.
HC Aug 9, 2021
The Weighted Average Illusion: Biases in Perceived Mean Position in Scatterplots
Matt-Heun Hong, Jessica K. Witt, Danielle Albers Szafir
Scatterplots can encode a third dimension by using additional channels like size or color (e.g. bubble charts). We explore a potential misinterpretation of trivariate scatterplots, which we call the weighted average illusion, where locations of larger and darker points are given more weight toward x- and y-mean estimates. This systematic bias is sensitive to a designer's choice of size or lightness ranges mapped onto the data. In this paper, we quantify this bias against varying size/lightness ranges and data correlations. We discuss possible explanations for its cause by measuring attention given to individual data points using a vision science technique called the centroid method. Our work illustrates how ensemble processing mechanisms and mental shortcuts can significantly distort visual summaries of data, and can lead to misconceptions like the demonstrated weighted average illusion.
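The direction of the bias described above can be illustrated by comparing the true mean position with a weighted mean in which each point's weight grows with its size or darkness. The sketch below is only an arithmetic illustration; the weights and points are hypothetical and do not model the perceptual weighting measured in the study.

```python
def mean_position(points):
    """Unweighted mean of (x, y) positions; the third value is ignored."""
    n = len(points)
    return (sum(x for x, _, _ in points) / n, sum(y for _, y, _ in points) / n)


def weighted_mean_position(points):
    """Mean of (x, y) positions weighted by each point's third value."""
    total = sum(w for _, _, w in points)
    return (
        sum(x * w for x, _, w in points) / total,
        sum(y * w for _, y, w in points) / total,
    )


# Hypothetical bubble chart: (x, y, weight), where weight stands in for
# how much attention a larger or darker mark attracts.
points = [(0.0, 0.0, 1.0), (10.0, 10.0, 3.0)]

print(mean_position(points))           # (5.0, 5.0)
print(weighted_mean_position(points))  # (7.5, 7.5)
```

The weighted estimate is pulled toward the heavier point, which is the kind of shift the illusion describes; widening the range of sizes or lightness values mapped onto the data corresponds to widening the spread of weights.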
HC Aug 5, 2021
Professional Differences: A Comparative Study of Visualization Task Performance and Spatial Ability Across Disciplines
Kyle Wm. Hall, Anthony Kouroupis, Anastasia Bezerianos et al.
Problem-driven visualization work is rooted in deeply understanding the data, actors, processes, and workflows of a target domain. However, an individual's personality traits and cognitive abilities may also influence visualization use. Diverse user needs and abilities raise natural questions about specificity in visualization design: Could individuals from different domains exhibit performance differences when using visualizations? Are any systematic variations related to their cognitive abilities? This study bridges domain-specific perspectives on visualization design with those provided by cognition and perception. We measure variations in visualization task performance across chemistry, computer science, and education, and relate these differences to variations in spatial ability. We conducted an online study with over 60 domain experts; it consisted of tasks related to pie charts, isocontour plots, and 3D scatterplots and was grounded in a well-documented spatial ability test. Task performance (correctness) varied with profession across more complex visualizations, but not pie charts, a comparatively common visualization. We found that correctness correlates with spatial ability, and the professions differ in terms of spatial ability. These results indicate that domains differ not only in the specifics of their data and tasks, but also in terms of how effectively their constituent members engage with visualizations and their cognitive traits. Analyzing participants' confidence and strategy comments suggests that focusing on performance neglects important nuances, such as differing approaches to engaging with even common visualizations and potential skill transference. Our findings offer a fresh perspective on discipline-specific visualization with recommendations to help guide visualization design that celebrates the uniqueness of the disciplines and individuals we seek to serve.
HC Aug 1, 2019
Color Crafting: Automating the Construction of Designer Quality Color Ramps
Stephen Smart, Keke Wu, Danielle Albers Szafir
Visualizations often encode numeric data using sequential and diverging color ramps. Effective ramps use colors that are sufficiently discriminable, align well with the data, and are aesthetically pleasing. Designers rely on years of experience to create high-quality color ramps. However, it is challenging for novice visualization developers that lack this experience to craft effective ramps as most guidelines for constructing ramps are loosely defined qualitative heuristics that are often difficult to apply. Our goal is to enable visualization developers to readily create effective color encodings using a single seed color. We do this using an algorithmic approach that models designer practices by analyzing patterns in the structure of designer-crafted color ramps. We construct these models from a corpus of 222 expert-designed color ramps, and use the results to automatically generate ramps that mimic designer practices. We evaluate our approach through an empirical study comparing the outputs of our approach with designer-crafted color ramps. Our models produce ramps that support accurate and aesthetically pleasing visualizations at least as well as designer ramps and that outperform conventional mathematical approaches.