46.3HCMay 29
Agentic Authoring of Interactive Multiview Visualizations in GenomicsAstrid van den Brandt, Kiroong Choe, Sehi L'Yi et al.
Diverse genomics data, scientific questions, and analysis tasks typically demand highly specialized visualizations. Therefore, users often must customize or author new ones tailored to their data. Existing tools are usually either limited in customization or require substantial learning or programming, and even expressive tools assume visualization expertise many users lack. Agentic and large language model (LLM) approaches are increasingly applied to complex scientific tasks, including visualization. Natural-language conversational interfaces offer a promising path to democratizing the authoring of complex visualizations. In the context of genomics, these approaches face additional challenges: genomics visualizations typically integrate heterogeneous data types and are composed of multiple linked interactive views. These challenges motivate more structured LLM-based schemes. We first characterize where vanilla LLM generation succeeds and fails for genomics visualization, identifying eight quality dimensions. We then compare six schemes--direct generation, a fixed pipeline, and four agentic configurations varying in the number of specialist agents and the presence of a reviewer--across 159 cases spanning three levels of query ambiguity and specification complexity. All schemes use the Gosling visualization grammar as structured output. Agentic iteration substantially improves perceived quality over both baselines, while more complex agent architectures yield no additional benefit. We discuss implications for designing agentic systems for domain-specific visualization authoring. All supplemental materials are available at https://osf.io/uqe83.
25.7HCMay 9
Sycamore: Characterizing Synthetic Personas for Evaluating Genomics Visualization RetrievalHuyen N. Nguyen, Astrid van den Brandt, Nils Gehlenborg
Evaluating visualization systems in niche domains such as genomics is challenging due to scarcity of domain experts and difficulty recruiting a representative user base. While LLM-based synthetic personas are increasingly used to ease evaluation bottlenecks, they face well-founded skepticism. Rather than weighing synthetic personas as substitutes for real users, we ask a fundamental open question: when synthetic personas evaluate a real visualization system, what do they actually produce, and how does that output change when grounded in documented human contexts? We present Sycamore, an exploratory three-condition probe design using Geranium, a search engine for multimodal genomics visualization, as a case study. Sycamore evaluates Geranium using: (1) ungrounded synthetic personas from generic LLM priors; (2) grounded synthetic personas constrained by voice-of-customer artifacts from a prior interview study; and (3) a published baseline study of real domain experts. We observe that grounding shifts synthetic feedback toward the language and concerns of documented users, while ungrounded evaluators drift toward operational specifics that real participants did not raise; both synthetic conditions, however, converge on a find-and-adapt frame and miss the image-modality preference observed in the expert study. We discuss what these observations imply for where synthetic personas might fit alongside expert studies in domain-specific visualization evaluation. All supplemental materials are available at https://osf.io/kdfr3/.