Ivan Viola

h-index: 31

14 papers

86 citations

Novelty: 50%

AI Score: 50

Ranked #20,498 of 194,257 authors (top 11%); #74 in HC (top 3%)

14 Papers

1.2 · QM · May 8, 2022

Differentiable Electron Microscopy Simulation: Methods and Applications for Visualization

Ngan Nguyen, Feng Liang, Dominik Engel et al.

We propose a new microscopy simulation system that can depict atomistic models in a micrograph visual style, similar to results of physical electron microscopy imaging. This system is scalable: it can simulate electron microscopy of tens of viral particles, and it synthesizes the image faster than previous methods. In addition, the simulator is differentiable in both its deterministic and stochastic stages, which form the signal and noise representations in the micrograph. This property makes it possible to solve inverse problems by means of optimization, and thus allows microscopy simulations to be generated using parameter settings estimated from real data. We demonstrate this learning capability through two applications: (1) estimating the parameters of the modulation transfer function defining the detector properties of the simulated and real micrographs, and (2) denoising the real data based on parameters trained from the simulated examples. While current simulators do not support any parameter estimation due to their forward design, we show that the results obtained using estimated parameters are very similar to real micrographs. Additionally, we evaluate the denoising capabilities of our approach and show an improvement over state-of-the-art methods. Denoised micrographs exhibit less noise in the tilt-series tomography reconstructions, ultimately reducing the visual dominance of noise in direct volume rendering of microscopy tomograms.

6.9 · HC · Jan 22, 2024

VOICE: Visual Oracle for Interaction, Conversation, and Explanation

Donggang Jia, Alexandra Irger, Lonni Besançon et al.

We present VOICE, a novel approach to science communication that connects large language models' (LLM) conversational capabilities with interactive exploratory visualization. VOICE introduces several technical contributions that drive our conversational visualization framework. Our foundation is a pack-of-bots that can perform specific tasks, such as assigning tasks, extracting instructions, and generating coherent content. We employ fine-tuning and prompt engineering techniques to tailor the bots' performance to their specific roles and to respond accurately to user queries. Our interactive text-to-visualization method generates a flythrough sequence matching the content explanation. In addition, natural language interaction provides capabilities to navigate and manipulate the 3D models in real time. The VOICE framework can receive arbitrary voice commands from the user and respond verbally, tightly coupled with the corresponding visual representation, with low latency and high accuracy. We demonstrate the effectiveness of our approach by applying it to the molecular visualization domain: analyzing three 3D molecular models with multi-scale and multi-instance attributes. Finally, we evaluate VOICE with education experts to show the potential of our approach. All supplemental materials are available at https://osf.io/g7fbr.

1.2 · GR · Apr 6, 2023

Dr. KID: Direct Remeshing and K-set Isometric Decomposition for Scalable Physicalization of Organic Shapes

Dawar Khan, Ciril Bohak, Ivan Viola

Dr. KID is an algorithm that uses isometric decomposition for the physicalization of potato-shaped organic models in a puzzle fashion. The algorithm begins by creating a simple, regular triangular surface mesh of the organic shape, followed by iterative k-means clustering and remeshing. For clustering, we need a measure of similarity between triangles (segments), which is defined as a distance function. The distance function maps each triangle's shape to a single point in a virtual 3D space. Thus, the distance between the triangles indicates their degree of dissimilarity. K-means clustering uses this distance to sort the segments into k classes. After this, remeshing is applied to minimize the distance between triangles within the same cluster by making their shapes identical. Clustering and remeshing are repeated until the distance between triangles in the same cluster reaches an acceptable threshold. We adopt a curvature-aware strategy to determine the surface thickness and finalize puzzle pieces for 3D printing. Identical hinges and holes are created for assembling the puzzle components. For smoother outcomes, we use triangle subdivision along with curvature-aware clustering, generating curved triangular patches for 3D printing. Our algorithm was evaluated using various models, and the 3D-printed results were analyzed. Findings indicate that our algorithm performs reliably on target organic shapes with minimal loss of input geometry.
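The clustering step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how a triangle's shape is mapped to a 3D point, so the sorted edge lengths used here are an assumption, and the names `shape_point` and `kmeans` are hypothetical. The remeshing step is omitted.

```python
import math
import random

def shape_point(tri):
    """Map a triangle (three 3D vertices) to a point in a 3D shape space.

    Assumption for illustration: the point is the sorted triple of edge
    lengths, so congruent triangles land on (nearly) the same point.
    """
    a, b, c = tri
    return tuple(sorted((math.dist(a, b), math.dist(b, c), math.dist(c, a))))

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means on a list of equal-length tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

if __name__ == "__main__":
    triangles = [
        ((0, 0, 0), (1, 0, 0), (0, 1, 0)),  # small right triangle
        ((5, 5, 5), (5, 6, 5), (5, 5, 6)),  # same shape, moved and rotated
        ((0, 0, 0), (4, 0, 0), (0, 3, 0)),  # 3-4-5 triangle
        ((1, 1, 1), (1, 1, 4), (5, 1, 1)),  # same 3-4-5 shape elsewhere
    ]
    labels, _ = kmeans([shape_point(t) for t in triangles], k=2)
    print(labels)
```

In this sketch, triangles that share a label are the candidates that a remeshing step would then push toward one identical shape.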

1.2 · QM · Apr 18, 2022

SynopSet: Multiscale Visual Abstraction Set for Explanatory Analysis of DNA Nanotechnology Simulations

Deng Luo, Alexandre Kouyoumdjian, Ondřej Strnad et al.

We propose a new abstraction set (SynopSet) that has a continuum of visual representations for the explanatory analysis of molecular dynamics simulations (MDS) in the DNA nanotechnology domain. We compose this new set of representations by re-purposing the commonly used progress bar, designing novel visuals, and transforming the data from the domain format to a format that better fits the newly designed visuals. This set is also designed to be capable of showing all spatial and temporal details and all structural complexity, or abstracting these to various degrees. This enables slow playback of the simulation for detailed examination, very fast playback for an overview that helps to efficiently identify events of interest, and several intermediate levels between these two extremes. For any pair of successive representations, we demonstrate smooth, continuous transitions, enabling users to keep track of relevant information from one representation to the next. By providing multiple representations suited to different temporal resolutions and connected by smooth transitions, we enable time-efficient simulation analysis, giving users the opportunity to examine and present important phases in great detail, or leverage abstract representations to go over uneventful phases much faster. Domain experts can thus gain actionable insight about their simulations and communicate it in a much shorter time. Further, the novel representations are more intuitive and also enable researchers unfamiliar with MDS analysis graphs to better understand the simulation results. We assessed the effectiveness of SynopSet on 12 DNA nanostructure simulations together with a domain expert. We also show that our set of representations can be systematically located in a visualization space, dubbed SynopSpace.

7.4 · HC · May 19

Chat Modeling: Interaction-Enhanced Agent Framework for Visualizing Literature-Grounded Biological Structures

Donggang Jia, Yunhai Wang, Ivan Viola

Bioscientists frequently seek to visualize the biological systems they have empirically characterized and reported in the literature. Realizing such visualizations requires biological structure modeling, an inherently complex process that demands both biological and geometric understanding. This paper addresses the problem of constructing such 3D models for visualization. We introduce a novel agent framework that mitigates the challenges of operating 3D modeling software by transforming user inputs, including natural language descriptions, research publication content, and textual descriptions of the existing objects and structures in the current scene, into modeling operations in a structured JSON format and final 3D results. The major technical contribution lies in the collaborative agent design that simultaneously supports model planning, execution, and novel user interaction design, such as interactive modeling execution and dynamic widget generation that fuse text and mouse interaction within the chat window. The framework further incorporates a customized modeling memory to enhance user interaction, featuring components such as personalized memory management, feedback collection, and skill library design. This modeling memory is leveraged to improve 3D modeling performance over time. The quantitative evaluation on our collected dataset showcases the effectiveness of our framework. We also develop a prototype tool, Chat Modeling, and demonstrate its usage through two modeling case studies. Our user study and expert interviews highlight the potential of our approach for use in scientific workflows.

7.0 · CV · Apr 7

EfficientMonoHair: Fast Strand-Level Reconstruction from Monocular Video via Multi-View Direction Fusion

Da Li, Dominik Engel, Deng Luo et al.

Strand-level hair geometry reconstruction is a fundamental problem in virtual human modeling and the digitization of hairstyles. However, existing methods still suffer from a significant trade-off between accuracy and efficiency. Implicit neural representations can capture the global hair shape but often fail to preserve fine-grained strand details, while explicit optimization-based approaches achieve high-fidelity reconstructions at the cost of heavy computation and poor scalability. To address this issue, we propose EfficientMonoHair, a fast and accurate framework that combines an implicit neural network with multi-view geometric fusion for strand-level reconstruction from monocular video. Our method introduces a fusion-patch-based multi-view optimization that reduces the number of optimization iterations for point cloud direction, as well as a novel parallel hair-growing strategy that relaxes voxel occupancy constraints, allowing large-scale strand tracing to remain stable and robust even under inaccurate or noisy orientation fields. Extensive experiments on representative real-world hairstyles demonstrate that our method can robustly reconstruct high-fidelity strand geometries. On synthetic benchmarks, our method achieves reconstruction quality comparable to state-of-the-art methods, while improving runtime efficiency by nearly an order of magnitude.

4.9 · GR · Mar 31

ARCOL: Aspect Ratio Constrained Orthogonal Layout

Zainab Alsuwaykit, Yousef Rajeh, Alexandre Kouyoumdjian et al.

Orthogonal graph layout algorithms aim to produce clear, compact, and readable network diagrams by arranging nodes and edges along horizontal and vertical lines, while minimizing bends and crossings. Most existing orthogonal layout methods focus primarily on quality criteria such as area usage, total edge length, and bend minimization. Explicit control of the global aspect ratio (AR) of the resulting layout remains unexplored: existing orthogonal layout methods offer no control over the resulting AR, and their rigid geometric constraints make adaptation of finished layouts difficult. With the increasing variety of aspect ratios encountered in daily life, from wide monitors to tall mobile devices or fixed-size interface panels, there is a clear need for aspect ratio control in orthogonal layout methods. To tackle this issue, we introduce Aspect Ratio-Constrained Orthogonal Layout (ARCOL). Building upon the Human-like Orthogonal Layout Algorithm (HOLA) [Kieffer2016], we integrate aspect ratio at two different stages: (1) into the stress minimization phase, as a soft constraint, allowing the layout algorithm to gently guide node positions toward a specified target AR, while preserving visual clarity and topological faithfulness; and (2) into the tree reattachment phase, where we modify the cost function to favor placements that improve the AR. We evaluate our approach through quantitative evaluation and a user study, as well as expert interviews. Our evaluations show that ARCOL produces balanced and space-efficient orthogonal layouts across diverse aspect ratios.
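The idea of adding the aspect ratio to stress minimization as a soft constraint can be sketched as a penalty term on top of the classic stress function. The abstract does not give the actual formulation, so everything below is an assumption made for illustration: the squared log-ratio penalty, the `weight` parameter, and the function names are all hypothetical.

```python
import math

def stress(pos, dist):
    """Classic layout stress: sum over node pairs of
    (|p_i - p_j| - d_ij)^2 / d_ij^2, with d_ij the graph distance."""
    total = 0.0
    n = len(pos)
    for i in range(n):
        for j in range(i + 1, n):
            d = dist[i][j]
            total += (math.dist(pos[i], pos[j]) - d) ** 2 / (d * d)
    return total

def aspect_ratio(pos):
    """Width / height of the bounding box (assumes nonzero height)."""
    xs = [p[0] for p in pos]
    ys = [p[1] for p in pos]
    return (max(xs) - min(xs)) / (max(ys) - min(ys))

def ar_penalty(pos, target_ar):
    """Hypothetical soft constraint: squared log of the ratio between the
    current and the target aspect ratio. Zero exactly at the target, and
    symmetric for layouts that are too wide or too tall by the same factor."""
    return math.log(aspect_ratio(pos) / target_ar) ** 2

def objective(pos, dist, target_ar, weight=1.0):
    """Stress plus weighted aspect-ratio penalty, to be minimized."""
    return stress(pos, dist) + weight * ar_penalty(pos, target_ar)

if __name__ == "__main__":
    # A 4-cycle drawn as a unit square; graph distances are 1 or 2.
    square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
    dist = [[0, 1, 2, 1], [1, 0, 1, 2], [2, 1, 0, 1], [1, 2, 1, 0]]
    print(objective(square, dist, target_ar=1.0))
    print(objective(square, dist, target_ar=2.0))
```

Because the penalty is soft, an optimizer minimizing `objective` trades layout faithfulness (stress) against the target shape, and `weight` sets how strongly the target AR is enforced.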

6.6 · CV · Apr 6

ClickAIXR: On-Device Multimodal Vision-Language Interaction with Real-World Objects in Extended Reality

Dawar Khan, Alexandre Kouyoumdjian, Xinyu Liu et al.

We present ClickAIXR, a novel on-device framework for multimodal vision-language interaction with objects in extended reality (XR). Unlike prior systems that rely on cloud-based AI (e.g., ChatGPT) or gaze-based selection (e.g., GazePointAR), ClickAIXR integrates an on-device vision-language model (VLM) with a controller-based object selection paradigm, enabling users to precisely click on real-world objects in XR. Once selected, the object image is processed locally by the VLM to answer natural language questions through both text and speech. This object-centered interaction reduces ambiguity inherent in gaze- or voice-only interfaces and improves transparency by performing all inference on-device, addressing concerns around privacy and latency. We implemented ClickAIXR in the Magic Leap SDK (C API) with ONNX-based local VLM inference. We conducted a user study comparing ClickAIXR with Gemini 2.5 Flash and ChatGPT 5, evaluating usability, trust, and user satisfaction. Results show that latency is moderate and user experience is acceptable. Our findings demonstrate the potential of click-based object selection combined with on-device AI to advance trustworthy, privacy-preserving XR interactions. The source code and supplementary materials are available at: nanovis.org/ClickAIXR.html

7.2 · HC · Jan 16, 2025

Augmenting a Large Language Model with a Combination of Text and Visual Data for Conversational Visualization of Global Geospatial Data

Omar Mena, Alexandre Kouyoumdjian, Lonni Besançon et al.

We present a method for augmenting a Large Language Model (LLM) with a combination of text and visual data to enable accurate question answering in visualization of scientific data, making conversational visualization possible. LLMs struggle with tasks like visual data interaction, as they lack contextual visual information. We address this problem by merging a text description of a visualization and dataset with snapshots of the visualization. We extract their essential features into a structured text file that is highly compact, yet descriptive enough to appropriately augment the LLM with contextual information, without any fine-tuning. This approach can be applied to any visualization that has already been rendered, as long as it is associated with some textual description.

10.4 · HC · Oct 10, 2021

Graph Models for Biological Pathway Visualization: State of the Art and Future Challenges

Hsiang-Yun Wu, Martin Nöllenburg, Ivan Viola

The concept of multilayer networks has recently been integrated into complex systems modeling, since it encapsulates a very general concept of complex relationships. Biological pathways are an example of complex real-world networks, where vertices represent biological entities, and edges indicate the underlying connectivity. For this reason, using multilayer networks to model biological knowledge allows us to formally cover essential properties and theories in the field. This also raises challenges in visualization because, in the early days of pathway visualization research, only restricted types of graphs, such as simple graphs and clustered graphs, were adopted. In this paper, we revisit a heterogeneous definition of biological networks and aim to provide an overview that reveals the gaps between data modeling and visual representation. The contribution therefore lies in providing guidelines and challenges for using multilayer networks as a unified data structure for biological pathway visualization.

11.1 · HC · Nov 4, 2020

Molecumentary: Scalable Narrated Documentaries Using Molecular Visualization

David Kouřil, Ondřej Strnad, Peter Mindek et al.

We present a method for producing documentary-style content using real-time scientific visualization. We produce molecumentaries, i.e., molecular documentaries featuring structural models from molecular biology. We employ scalable methods instead of the rigid traditional production pipeline. Our method is motivated by the rapid evolution of interactive scientific visualization, which shows great potential in science dissemination. Without some form of explanation or guidance, however, novices and lay-persons often find it difficult to gain insights from the visualization itself. We integrate such knowledge using the verbal channel and provide it alongside an engaging visual presentation. To realize the synthesis of a molecumentary, we provide technical solutions along two major production steps: 1) preparing a story structure and 2) turning the story into a concrete narrative. In the first step, information about the model from heterogeneous sources is compiled into a story graph. Local knowledge is combined with remote sources to complete the story graph and enrich the final result. In the second step, a narrative, i.e., story elements presented in sequence, is synthesized using the story graph. We present a method for traversing the story graph and generating a virtual tour, using automated camera and visualization transitions. Texts written by domain experts are turned into verbal representations using text-to-speech functionality and provided as a commentary. Using the described framework, we synthesize automatic fly-throughs with descriptions that mimic a manually authored documentary. Furthermore, we demonstrate a second scenario: guiding the documentary narrative by a textual input.

2.9 · CR · Sep 4, 2020

Homomorphic-Encrypted Volume Rendering

Sebastian Mazza, Daniel Patel, Ivan Viola

Computationally demanding tasks are typically calculated in dedicated data centers, and real-time visualizations also follow this trend. Some rendering tasks, however, require the highest level of confidentiality so that no other party, besides the owner, can read or see the sensitive data. Here we present a direct volume rendering approach that performs volume rendering directly on encrypted volume data by using the homomorphic Paillier encryption algorithm. This approach ensures that the volume data and rendered image are uninterpretable to the rendering server. Our volume rendering pipeline introduces novel approaches for encrypted-data compositing, interpolation, and opacity modulation, as well as simple transfer function design, where each of these routines maintains the highest level of privacy. We present performance and memory overhead analysis that is associated with our privacy-preserving scheme. Our approach is open and secure by design, as opposed to secure through obscurity. Owners of the data only have to keep their secure key confidential to guarantee the privacy of their volume data and the rendered images. Our work is, to our knowledge, the first privacy-preserving remote volume-rendering approach that does not require that any server involved be trustworthy; even in cases when the server is compromised, no sensitive data will be leaked to a foreign party.
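The property of Paillier encryption that makes rendering on encrypted data possible is its additive homomorphism: multiplying ciphertexts adds the plaintexts, and raising a ciphertext to a plaintext integer multiplies the plaintext by that integer. The sketch below is a toy illustration of that property only, not the paper's rendering pipeline. The primes are far too small for real use, the function names are hypothetical, and the integer-weighted sum merely stands in for operations such as interpolation or summation along a ray.

```python
import math
import secrets

def keygen(p=1789, q=1861):
    """Toy key generation. The default primes are for illustration only;
    a real deployment needs large random primes."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1                  # common simplifying choice of generator
    mu = pow(lam, -1, n)       # valid inverse because g = n + 1
    return (n, g), (lam, mu)

def encrypt(pub, m):
    """Encrypt an integer 0 <= m < n with fresh randomness r."""
    n, g = pub
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(pub, priv, c):
    n, _ = pub
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n

def weighted_sum(pub, ciphertexts, weights):
    """Server-side step: combine encrypted samples with plaintext integer
    weights without ever decrypting them."""
    n, _ = pub
    n2 = n * n
    acc = encrypt(pub, 0)
    for c, w in zip(ciphertexts, weights):
        acc = (acc * pow(c, w, n2)) % n2   # adds w * sample to the plaintext
    return acc

if __name__ == "__main__":
    pub, priv = keygen()
    samples = [120, 200, 35]               # e.g. voxel intensities (client side)
    encrypted = [encrypt(pub, s) for s in samples]
    result = weighted_sum(pub, encrypted, [1, 2, 1])   # untrusted server
    print(decrypt(pub, priv, result))      # 120*1 + 200*2 + 35*1 = 555
```

Only the holder of the private key can read `samples` or the combined result; the party calling `weighted_sum` sees ciphertexts and weights alone. Results are correct as long as the true sum stays below `n`.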

1.2 · GR · Jul 29, 2019

ScaleTrotter: Illustrative Visual Travels Across Negative Scales

Sarkis Halladjian, Haichao Miao, David Kouřil et al.

We present ScaleTrotter, a conceptual framework for an interactive, multi-scale visualization of biological mesoscale data and, specifically, genome data. ScaleTrotter allows viewers to smoothly transition from the nucleus of a cell to the atomistic composition of the DNA, while bridging several orders of magnitude in scale. The challenges in creating an interactive visualization of genome data are fundamentally different in several ways from those in other domains like astronomy that require a multi-scale representation as well. First, genome data has intertwined scale levels: the DNA is an extremely long, connected molecule that manifests itself at all scale levels. Second, elements of the DNA do not disappear as one zooms out; instead, the scale levels at which they are observed group these elements differently. Third, we have detailed information and thus geometry for the entire dataset and for all scale levels, posing a challenge for interactive visual exploration. Finally, the conceptual scale levels for genome data are close in scale space, requiring us to find ways to visually embed a smaller scale into a coarser one. We address these challenges by creating a new multi-scale visualization concept. We use a scale-dependent camera model that controls the visual embedding of the scales into their respective parents, the rendering of a subset of the scale hierarchy, and the location, size, and scope of the view. In traversing the scales, ScaleTrotter roams between 2D and 3D visual representations that are depicted in integrated visuals. We discuss, specifically, how this form of multi-scale visualization follows from the specific characteristics of the genome data and describe its implementation. Finally, we discuss the implications of our work for the general illustrative depiction of multi-scale data.

3.5 · HC · Jul 9, 2014

Illustrating Polymerization using Three-level Model Fusion

Ivan Kolesar, Julius Parulek, Ivan Viola et al.

Research in cell biology is steadily contributing new knowledge about many different aspects of physiological processes like polymerization, both with respect to the involved molecular structures as well as their related function. Illustrations of the spatio-temporal development of such processes are not only used in biomedical education, but also can serve scientists as an additional platform for in-silico experiments. In this paper, we contribute a new, three-level modeling approach to illustrate physiological processes from the class of polymerization at different time scales. We integrate physical and empirical modeling, according to which approach suits the different involved levels of detail best, and we additionally enable a simple form of interactive steering while the process is illustrated. We demonstrate the suitability of our approach in the context of several polymerization processes and report from a first evaluation with domain experts.