Benjamin Bach

h-index: 48

13 papers

1,208 citations

Novelty: 29%

AI Score: 30

Ranked #139,813 of 194,257 authors (top 72%); #1,231 in HC (top 49%)

13 Papers

8.3 | HC | May 1, 2024

ChatGPT in Data Visualization Education: A Student Perspective

Nam Wook Kim, Hyung-Kwon Ko, Grace Myers et al.

Unlike traditional educational chatbots that rely on pre-programmed responses, large-language model-driven chatbots, such as ChatGPT, demonstrate remarkable versatility to serve as a dynamic resource for addressing student needs from understanding advanced concepts to solving complex problems. This work explores the impact of such technology on student learning in an interdisciplinary, project-oriented data visualization course. Throughout the semester, students engaged with ChatGPT across four distinct projects, designing and implementing data visualizations using a variety of tools such as Tableau, D3, and Vega-lite. We collected conversation logs and reflection surveys after each assignment and conducted interviews with selected students to gain deeper insights into their experiences with ChatGPT. Our analysis examined the advantages and barriers of using ChatGPT, students' querying behavior, the types of assistance sought, and its impact on assignment outcomes and engagement. We discuss design considerations for an educational solution tailored for data visualization education, extending beyond ChatGPT's basic interface.

2.7 | CL | Apr 1, 2025 | Code

Investigating the Capabilities and Limitations of Machine Learning for Identifying Bias in English Language Data with Information and Heritage Professionals

Lucy Havens, Benjamin Bach, Melissa Terras et al.

Despite numerous efforts to mitigate their biases, ML systems continue to harm already-marginalized people. While predominant ML approaches assume bias can be removed and fair models can be created, we show that these are not always possible, nor desirable, goals. We reframe the problem of ML bias by creating models to identify biased language, drawing attention to a dataset's biases rather than trying to remove them. Then, through a workshop, we evaluated the models for a specific use case: workflows of information and heritage professionals. Our findings demonstrate the limitations of ML for identifying bias due to its contextual nature, the way in which approaches to mitigating it can simultaneously privilege and oppress different communities, and its inevitability. We demonstrate the need to expand ML approaches to bias and fairness, providing a mixed-methods approach to investigating the feasibility of removing bias or achieving fairness in a given ML use case.

2.7 | HC | Feb 5, 2024

Feature-Action Design Patterns for Storytelling Visualizations with Time Series Data

Saiful Khan, Scott Jones, Benjamin Bach et al.

We present a method to create storytelling visualization with time series data. Many personal decisions nowadays rely on regular access to dynamic data, as we have seen during the COVID-19 pandemic. It is thus desirable to construct storytelling visualization for dynamic data that is selected by an individual for a specific context. Because of the need to tell data-dependent stories, predefined storyboards based on known data cannot accommodate dynamic data easily nor scale up to many different individuals and contexts. Motivated initially by the need to communicate time series data during the COVID-19 pandemic, we developed a novel computer-assisted method for meta-authoring of stories, which enables the design of storyboards that include feature-action patterns in anticipation of potential features that may appear in dynamically arriving or selected data. In addition to meta-storyboards involving COVID-19 data, we also present storyboards for telling stories about progress in a machine learning workflow. Our approach is complementary to traditional methods for authoring storytelling visualization, and provides an efficient means to construct data-dependent storyboards for different data-streams of similar contexts.

2.9 | HC | Feb 22, 2022

GAN'SDA Wrap: Geographic And Network Structured Data on surfaces that Wrap around

Kun-Ting Chen, Tim Dwyer, Yalong Yang et al.

There are many methods for projecting spherical maps onto the plane. Interactive versions of these projections allow the user to centre the region of interest. However, the effects of such interaction have not previously been evaluated. In a study with 120 participants we find interaction provides significantly more accurate area, direction and distance estimation in such projections. The surface of 3D sphere and torus topologies provides a continuous surface for uninterrupted network layout. But how best to project spherical network layouts to 2D screens has not been studied, nor have such spherical network projections been compared to torus projections. Using the most successful interactive sphere projections from our first study, we compare spherical, standard and toroidal layouts of networks for cluster and path following tasks with 96 participants, finding benefits for both spherical and toroidal layouts over standard network layouts in terms of accuracy for cluster understanding tasks.

12.0 | HC | Aug 13, 2021

Visual Arrangements of Bar Charts Influence Comparisons in Viewer Takeaways

Cindy Xiong, Vidya Setlur, Benjamin Bach et al.

Well-designed data visualizations can lead to more powerful and intuitive processing by a viewer. To help a viewer intuitively compare values to quickly generate key takeaways, visualization designers can manipulate how data values are arranged in a chart to afford particular comparisons. Using simple bar charts as a case study, we empirically tested the comparison affordances of four common arrangements: vertically juxtaposed, horizontally juxtaposed, overlaid, and stacked. We asked participants to type out what patterns they perceived in a chart, and coded their takeaways into types of comparisons. In a second study, we asked data visualization design experts to predict which arrangement they would use to afford each type of comparison and found both alignments and mismatches with our findings. These results provide concrete guidelines for how both human designers and automatic chart recommendation systems can make visualizations that help viewers extract the 'right' takeaway.

8.6 | HC | Jul 19, 2021 | Code

Propagating Visual Designs to Numerous Plots and Dashboards

Saiful Khan, Phong H. Nguyen, Alfie Abdul-Rahman et al.

In the process of developing an infrastructure for providing visualization and visual analytics (VIS) tools to epidemiologists and modeling scientists, we encountered a technical challenge for applying a number of visual designs to numerous datasets rapidly and reliably with limited development resources. In this paper, we present a technical solution to address this challenge. Operationally, we separate the tasks of data management, visual designs, and plots and dashboard deployment in order to streamline the development workflow. Technically, we utilize: an ontology to bring datasets, visual designs, and deployable plots and dashboards under the same management framework; multi-criteria search and ranking algorithms for discovering potential datasets that match a visual design; and a purposely designed user interface for propagating each visual design to appropriate datasets (often in tens and hundreds) and quality-assuring the propagation before the deployment. This technical solution has been used in the development of the RAMPVIS infrastructure for supporting a consortium of epidemiologists and modeling scientists through visualization.

12.0 | HC | Mar 15, 2021

The Public Life of Data: Investigating Reactions to Visualizations on Reddit

Tobias Kauer, Arran Ridley, Marian Dörk et al.

This research investigates how people engage with data visualizations when commenting on the social platform Reddit. There has been considerable research on collaborative sensemaking with visualizations and the personal relation of people with data. Yet, little is known about how public audiences without specific expertise and shared incentives openly express their thoughts, feelings, and insights in response to data visualizations. Motivated by the extensive social exchange around visualizations in online communities, this research examines characteristics and motivations of people's reactions to posts featuring visualizations. Following a Grounded Theory approach, we study 475 reactions from the /r/dataisbeautiful community, identify ten distinguishable reaction types, and consider their contribution to the discourse. A follow-up survey with 168 Reddit users clarified their intentions to react. Our results help understand the role of personal perspectives on data and inform future interfaces that integrate audience reactions into visualizations to foster a public discourse about data.

8.6 | HC | Jan 15, 2021

Visualizing and Interacting with Geospatial Networks: A Survey and Design Space

Sarah Schöttler, Yalong Yang, Hanspeter Pfister et al.

This paper surveys visualization and interaction techniques for geospatial networks from a total of 95 papers. Geospatial networks are graphs where nodes and links can be associated with geographic locations. Examples can include social networks, trade and migration, as well as traffic and transport networks. Visualizing geospatial networks poses numerous challenges around the integration of both network and geographical information as well as additional information such as node and link attributes, time, and uncertainty. Our overview analyzes existing techniques along four dimensions: i) the representation of geographical information, ii) the representation of network information, iii) the visual integration of both, and iv) the use of interaction. These four dimensions allow us to discuss techniques with respect to the trade-offs they make between showing information across all these dimensions and how they solve the problem of showing as much information as necessary while maintaining readability of the visualization. https://geonetworks.github.io.

7.9 | HC | Dec 8, 2020

RAMPVIS: Towards a New Methodology for Developing Visualisation Capabilities for Large-scale Emergency Responses

M. Chen, A. Abdul-Rahman, D. Archambault et al.

The effort for combating the COVID-19 pandemic around the world has resulted in a huge amount of data, e.g., from testing, contact tracing, modelling, treatment, vaccine trials, and more. In addition to numerous challenges in epidemiology, healthcare, biosciences, and social sciences, there has been an urgent need to develop and provide visualisation and visual analytics (VIS) capacities to support emergency responses under difficult operational conditions. In this paper, we report the experience of a group of VIS volunteers who have been working in a large research and development consortium and providing VIS support to various observational, analytical, model-developmental and disseminative tasks. In particular, we describe our approaches to the challenges that we have encountered in requirements analysis, data acquisition, visual design, software design, system development, team organisation, and resource planning. By reflecting on our experience, we propose a set of recommendations as the first step towards a methodology for developing and providing rapid VIS capacities to support emergency responses.

31.1 | CL | Nov 11, 2020

Situated Data, Situated Systems: A Methodology to Engage with Power Relations in Natural Language Processing Research

Lucy Havens, Melissa Terras, Benjamin Bach et al.

We propose a bias-aware methodology to engage with power relations in natural language processing (NLP) research. NLP research rarely engages with bias in social contexts, limiting its ability to mitigate bias. While researchers have recommended actions, technical methods, and documentation practices, no methodology exists to integrate critical reflections on bias with technical NLP methods. In this paper, after an extensive and interdisciplinary literature review, we contribute a bias-aware methodology for NLP research. We also contribute a definition of biased text, a discussion of the implications of biased NLP systems, and a case study demonstrating how we are executing the bias-aware methodology in research on archival metadata descriptions.

9.6 | HC | Oct 18, 2020

Studying Visualization Guidelines According to Grounded Theory

Alexandra Diehl, Matthias Kraus, Alfie Abdul-Rahman et al.

Visualization guidelines, if defined properly, are invaluable to both practical applications and the theoretical foundation of visualization. In this paper, we present a collection of research activities for studying visualization guidelines according to Grounded Theory (GT). We used the discourses at VisGuides, which is an online discussion forum for visualization guidelines, as the main data source for enabling data-driven research processes as advocated by the grounded theory methodology. We devised a categorization scheme focusing on observing how visualization guidelines were featured in different threads and posts at VisGuides, and coded all 248 posts between September 27, 2017 (when VisGuides was first launched) and March 13, 2019. To complement manual categorization and coding, we used text analysis and visualization to help reveal patterns that may have been missed by the manual effort and summary statistics. To facilitate theoretical sampling and negative case analysis, we made an in-depth analysis of the 148 posts (with both questions and replies) related to a student assignment of a visualization course. Inspired by two discussion threads at VisGuides, we conducted two controlled empirical studies to collect further data to validate specific visualization guidelines. Through these activities guided by grounded theory, we have obtained some new findings about visualization guidelines.

22.4 | HC | Aug 17, 2020

What Makes a Data-GIF Understandable?

Xinhuan Shu, Aoyu Wu, Junxiu Tang et al.

GIFs are enjoying increasing popularity on social media as a format for data-driven storytelling with visualization; simple visual messages are embedded in short animations that usually last less than 15 seconds and are played in automatic repetition. In this paper, we ask the question, "What makes a data-GIF understandable?" While other storytelling formats such as data videos, infographics, or data comics are relatively well studied, we have little knowledge about the design factors and principles for "data-GIFs". To close this gap, we provide results from semi-structured interviews and an online study with a total of 118 participants investigating the impact of design decisions on the understandability of data-GIFs. The study and our consequent analysis are informed by a systematic review and structured design space of 108 data-GIFs that we found online. Our results show the impact of design dimensions from our design space such as animation encoding, context preservation, or repetition on viewers' understanding of the GIF's core message. The paper concludes with a list of suggestions for creating more effective data-GIFs.

14.7 | HC | May 1, 2020 | Code

A Generic Framework and Library for Exploration of Small Multiples through Interactive Piling

Fritz Lekschas, Xinyi Zhou, Wei Chen et al.

Small multiples are miniature representations of visual information used generically across many domains. Handling large numbers of small multiples imposes challenges on many analytic tasks like inspection, comparison, navigation, or annotation. To address these challenges, we developed a framework and implemented a library called Piling.js for designing interactive piling interfaces. Based on the piling metaphor, such interfaces afford flexible organization, exploration, and comparison of large numbers of small multiples by interactively aggregating visual objects into piles. Based on a systematic analysis of previous work, we present a structured design space to guide the design of visual piling interfaces. To enable designers to efficiently build their own visual piling interfaces, Piling.js provides a declarative interface to avoid having to write low-level code and implements common aspects of the design space. An accompanying GUI additionally supports the dynamic configuration of the piling interface. We demonstrate the expressiveness of Piling.js with examples from machine learning, immunofluorescence microscopy, genomics, and public health.