HC · CL · Jan 16, 2025

Augmenting a Large Language Model with a Combination of Text and Visual Data for Conversational Visualization of Global Geospatial Data

arXiv:2501.09521v1 · 3 citations · h-index: 18
Originality: Incremental advance
AI Analysis

This work addresses the challenge of interacting with visual data that researchers and analysts in scientific domains face, though the advance is incremental because it builds on existing LLM capabilities.

LLMs lack the contextual visual information needed to answer questions about scientific data visualizations accurately. The authors address this by augmenting an LLM with a combination of text and visual data, which enables conversational visualization without fine-tuning.

We present a method for augmenting a Large Language Model (LLM) with a combination of text and visual data to enable accurate question answering in visualization of scientific data, making conversational visualization possible. LLMs struggle with tasks like visual data interaction, as they lack contextual visual information. We address this problem by merging a text description of a visualization and dataset with snapshots of the visualization. We extract their essential features into a structured text file, highly compact, yet descriptive enough to appropriately augment the LLM with contextual information, without any fine-tuning. This approach can be applied to any visualization that has already been rendered in its final form, as long as it is associated with some textual description.
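
The general idea in the abstract (merge a text description with features taken from snapshots, store them as compact structured text, and pass that text to an LLM as context) can be illustrated with a minimal sketch. This is not the authors' implementation: the class names, the field names, the choice of JSON as the structured format, and the prompt wording are all assumptions made for illustration, and the snapshot features are assumed to have been extracted already by some separate step.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Tuple


@dataclass
class SnapshotFeatures:
    """Hypothetical features taken from one rendered snapshot."""
    view: str                          # e.g. which region or camera angle is shown
    colormap: str                      # name of the color scale used in the image
    value_range: Tuple[float, float]   # min and max value mapped to colors
    notable_regions: List[str] = field(default_factory=list)


@dataclass
class VisualizationContext:
    """Hypothetical merge of a text description and snapshot features."""
    dataset_description: str
    variable: str
    units: str
    snapshots: List[SnapshotFeatures] = field(default_factory=list)

    def to_structured_text(self) -> str:
        # Compact JSON (no extra whitespace) keeps the context short.
        return json.dumps(asdict(self), separators=(",", ":"))


def build_prompt(context: VisualizationContext, question: str) -> str:
    """Prepend the structured context to the question; no fine-tuning involved."""
    return (
        "Context (structured description of a dataset and its rendered views):\n"
        f"{context.to_structured_text()}\n"
        f"Question: {question}\n"
        "Answer using only the context above."
    )
```

The string returned by `build_prompt` would then be sent to whichever LLM is in use. The sketch stops short of that call on purpose, since the source does not say which model or API the authors used.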

