Cagatay Demiralp

HC
5 papers
75 citations
Novelty 32%
AI Score 38

5 Papers

52.6 · AI · Apr 8
Bridging Natural Language and Interactive What-If Interfaces via LLM-Generated Declarative Specification

Sneha Gathani, Sirui Zeng, Diya Patel et al. · mit

What-if analysis (WIA) is an iterative, multi-step process where users explore and compare hypothetical scenarios by adjusting parameters, applying constraints, and scoping data through interactive interfaces. Current tools fall short of supporting effective interactive WIA: spreadsheet and BI tools require time-consuming and laborious setup, while LLM-based chatbot interfaces are semantically fragile, frequently misinterpret intent, and produce inconsistent results as conversations progress. To address these limitations, we present a two-stage workflow that translates natural language (NL) WIA questions into interactive visual interfaces via an intermediate representation, powered by the Praxa Specification Language (PSL). First, LLMs generate PSL specifications from NL questions, capturing analytical intent and logic and enabling validation and repair of erroneous specifications. Second, the specifications are compiled into interactive visual interfaces with parameter controls and linked visualizations. We benchmark this workflow with 405 WIA questions spanning 11 WIA types, 5 datasets, and 3 state-of-the-art LLMs. The results show that, across models, about half of the specifications (52.42%) are generated correctly without intervention. We analyze the failure cases and derive an error taxonomy spanning non-functional errors (specifications fail to compile) and functional errors (specifications compile but misrepresent intent). Based on the taxonomy, we apply targeted repairs to the failure cases using few-shot prompts and improve the success rate to 80.42%. Finally, we show how undetected functional errors propagate through compilation into plausible but misleading interfaces, demonstrating that the intermediate specification is critical for reliably bridging NL and interactive WIA interfaces in LLM-powered WIA systems.
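The generate, validate, and repair loop described in this abstract could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the spec is modeled as a plain dict, and `REQUIRED_KEYS`, `classify`, `generate_with_repair`, and every `stub_*` function are hypothetical names standing in for the LLM call, the PSL compiler, and an intent check. In practice the intent check is the hard part, since the abstract notes that functional errors can go undetected.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Labels mirroring the two error classes named in the abstract.
NON_FUNCTIONAL = "non-functional"  # specification fails to compile
FUNCTIONAL = "functional"          # compiles but misrepresents intent


@dataclass
class Outcome:
    spec: dict
    error: Optional[str]  # None when the spec passed both checks
    repairs: int          # number of repair attempts that were made


def classify(spec: dict,
             compiles: Callable[[dict], bool],
             matches_intent: Callable[[dict], bool]) -> Optional[str]:
    """Return the error class of a spec, or None if no error is detected."""
    if not compiles(spec):
        return NON_FUNCTIONAL
    if not matches_intent(spec):
        return FUNCTIONAL
    return None


def generate_with_repair(question, generate, repair, compiles,
                         matches_intent, max_repairs=1) -> Outcome:
    """Generate a spec, then attempt targeted repairs while an error remains."""
    spec = generate(question)
    error = classify(spec, compiles, matches_intent)
    repairs = 0
    while error is not None and repairs < max_repairs:
        spec = repair(question, spec, error)  # repair is told the error class
        repairs += 1
        error = classify(spec, compiles, matches_intent)
    return Outcome(spec, error, repairs)


# --- Hypothetical stand-ins so the sketch runs end to end ---
REQUIRED_KEYS = {"data", "model", "operations"}  # assumed spec fields


def stub_generate(question):
    # Stand-in for an LLM call; deliberately omits "operations".
    return {"data": "sales", "model": "linear"}


def stub_repair(question, spec, error):
    # Stand-in for a few-shot repair prompt targeted at the error class.
    fixed = dict(spec)
    if error == NON_FUNCTIONAL:
        fixed.setdefault("operations", [])
    return fixed


def stub_compiles(spec):
    return REQUIRED_KEYS.issubset(spec)


def stub_matches_intent(spec):
    return True  # assumed oracle; a real check would need ground truth


outcome = generate_with_repair("What if price rises 10%?", stub_generate,
                               stub_repair, stub_compiles, stub_matches_intent)
```

Passing the error class into `repair` reflects the abstract's point that repairs are targeted by taxonomy category rather than applied as a generic retry.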

59.5 · HC · Apr 8
PRAXA: A Grammar for What-If Analysis

Sneha Gathani, Kevin Li, Raghav Thind et al. · mit

What-if analysis is widely used to explore hypothetical scenarios and evaluate alternative pathways to desired results. However, current approaches are fragmented: systems implement what-if capabilities under diverse terminologies with different analytic techniques. Such fragmentation limits expressiveness, impedes flexible composition and reuse of workflows, and hinders tighter integration with AI. We present PRAXA, a compositional grammar of what-if analysis derived from recurring patterns across 141 publications in visual analytics and HCI venues. PRAXA formulates three primitives: (1) data, defining the variables under analysis; (2) model, specifying predictive mechanisms; and (3) interaction operations, which are pairs of user actions and system responses that execute analyses. We encode PRAXA into a declarative specification language, PSL. To evaluate PRAXA, we first show expressiveness by reconstructing representative workflows from prior work as structured compositions, exposing the predominant focus on single-step rather than multi-step reasoning. Second, we demonstrate composability by revealing that capabilities described under distinct terminologies share the same grammatical structure with different parameterizations, and that new multi-step workflows emerge through composition. Third, we illustrate PSL as an intermediate representation for translating natural-language what-if queries into executable interactive interfaces, enabling inspection, validation, and more transparent AI integration. By unifying diverse what-if approaches as a grammar, PRAXA provides a foundation for analyzing, composing, and supporting workflows in next-generation what-if systems.
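One way to picture the three primitives and their composition is the small sketch below. It is not PSL and does not reflect PRAXA's actual syntax: the class names, field names, the `run` helper, and the toy revenue model are all assumptions made for illustration. The only idea taken from the abstract is that an interaction operation pairs a user action with a system response, and that multi-step workflows arise by composing such operations.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, float]


@dataclass
class Data:
    variables: State  # the variables under analysis


@dataclass
class Model:
    predict: Callable[[State], float]  # the predictive mechanism


@dataclass
class Operation:
    action: Callable[[State], State]           # user action: edits the scenario
    response: Callable[[State, Model], float]  # system response: evaluates it


def run(data: Data, model: Model, operations: List[Operation]) -> List[float]:
    """Apply operations in order; each step builds on the previous scenario."""
    state = dict(data.variables)  # copy so the original data is not mutated
    results = []
    for op in operations:
        state = op.action(state)
        results.append(op.response(state, model))
    return results


# Toy example (made-up numbers): revenue = price * units.
data = Data({"price": 10.0, "units": 100.0})
model = Model(lambda s: s["price"] * s["units"])


def predict_response(state: State, m: Model) -> float:
    return m.predict(state)


set_price = Operation(lambda s: {**s, "price": 12.0}, predict_response)
add_units = Operation(lambda s: {**s, "units": s["units"] + 10.0}, predict_response)

results = run(data, model, [set_price, add_units])
```

Here the second operation starts from the scenario left by the first, which is the sense in which a two-step workflow is a composition of single-step operations.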

CL · Apr 6, 2020
Enhancing Review Comprehension with Domain-Specific Commonsense

Aaron Traylor, Chen Chen, Behzad Golshan et al.

Review comprehension has played an increasingly important role in improving the quality of online services and products, and commonsense knowledge can further enhance review comprehension. However, existing general-purpose commonsense knowledge bases lack sufficient coverage and precision to meaningfully improve the comprehension of domain-specific reviews. In this paper, we introduce xSense, an effective system for review comprehension using domain-specific commonsense knowledge bases (xSense KBs). We show that xSense KBs can be constructed inexpensively and present a knowledge distillation method that enables us to use xSense KBs along with BERT to boost performance on various review comprehension tasks. We evaluate xSense on three review comprehension tasks: aspect extraction, aspect sentiment classification, and question answering. We find that xSense outperforms the state-of-the-art models on the first two tasks and significantly improves on the baseline BERT QA model, demonstrating the usefulness of incorporating commonsense into review comprehension pipelines. To facilitate future research and applications, we publicly release three domain-specific knowledge bases and a domain-specific question answering benchmark along with this paper.

HC · Jul 17, 2018
Beyond Heuristics: Learning Visualization Design

Bahador Saket, Dominik Moritz, Halden Lin et al.

In this paper, we describe a research agenda for deriving design principles directly from data. We argue that it is time to go beyond manually curated and applied visualization design guidelines. We propose learning models of visualization design from data collected using graphical perception studies and building tools powered by the learned models. To achieve this vision, we need to 1) develop scalable methods for collecting training data, 2) collect different forms of training data, 3) advance the interpretability of machine learning models, and 4) develop adaptive models that evolve as more data becomes available.

HC · Sep 25, 2017
Task-Based Effectiveness of Basic Visualizations

Bahador Saket, Alex Endert, Cagatay Demiralp

Visualizations of tabular data are widely used; understanding their effectiveness in different task and data contexts is fundamental to scaling their impact. However, little is known about how basic tabular data visualizations perform across varying data analysis tasks and data attribute types. In this paper, we report results from a crowdsourced experiment to evaluate the effectiveness of five visualization types (Table, Line Chart, Bar Chart, Scatterplot, and Pie Chart) across ten common data analysis tasks and three data attribute types using two real-world datasets. We found that the effectiveness of these visualization types varies significantly across task and data attribute types, suggesting that visualization design would benefit from considering context-dependent effectiveness. Based on our findings, we derive recommendations on which visualizations to choose for different tasks. Finally, we train a decision tree on the data we collected to drive a recommender, showcasing how to effectively engineer experimental user data into practical visualization systems.