Manuela Waldner

HC
h-index1
8papers
26citations
Novelty38%
AI Score38

8 Papers

10.5GRMar 11
TreeON: Reconstructing 3D Tree Point Clouds from Orthophotos and Heightmaps

Angeliki Grammatikaki, Johannes Eschner, Pedro Hermosilla et al.

We present TreeON, a novel neural-based framework for reconstructing detailed 3D tree point clouds from sparse top-down geodata, using only a single orthophoto and its corresponding Digital Surface Model (DSM). Our method introduces a new training supervision strategy that combines both geometric supervision and differentiable shadow and silhouette losses to learn point cloud representations of trees without requiring species labels, procedural rules, terrestrial reconstruction data, or ground laser scans. To address the lack of ground truth data, we generate a synthetic dataset of point clouds from procedurally modeled trees and train our network on it. Quantitative and qualitative experiments demonstrate better reconstruction quality and coverage compared to existing methods, as well as strong generalization to real-world data, producing visually appealing and structurally plausible tree point cloud representations suitable for integration into interactive digital 3D maps. The codebase, synthetic dataset, and pretrained model are publicly available at https://angelikigram.github.io/treeON/.

HCApr 28, 2025
Interactive Discovery and Exploration of Visual Bias in Generative Text-to-Image Models

Johannes Eschner, Roberto Labadie-Tamayo, Matthias Zeppelzauer et al.

Bias in generative Text-to-Image (T2I) models is a known issue, yet systematically analyzing such models' outputs to uncover it remains challenging. We introduce the Visual Bias Explorer (ViBEx) to interactively explore the output space of T2I models to support the discovery of visual bias. ViBEx introduces a novel flexible prompting tree interface in combination with zero-shot bias probing using CLIP for quick and approximate bias exploration. It additionally supports in-depth confirmatory bias analysis through visual inspection of forward, intersectional, and inverse bias queries. ViBEx is model-agnostic and publicly available. In four case study interviews, experts in AI and ethics were able to discover visual biases that have so far not been described in literature.

CVOct 22, 2025
A Matter of Time: Revealing the Structure of Time in Vision-Language Models

Nidham Tekaya, Manuela Waldner, Matthias Zeppelzauer

Large-scale vision-language models (VLMs) such as CLIP have gained popularity for their generalizable and expressive multimodal representations. By leveraging large-scale training data with diverse textual metadata, VLMs acquire open-vocabulary capabilities, solving tasks beyond their training scope. This paper investigates the temporal awareness of VLMs, assessing their ability to position visual content in time. We introduce TIME10k, a benchmark dataset of over 10,000 images with temporal ground truth, and evaluate the time-awareness of 37 VLMs by a novel methodology. Our investigation reveals that temporal information is structured along a low-dimensional, non-linear manifold in the VLM embedding space. Based on this insight, we propose methods to derive an explicit ``timeline'' representation from the embedding space. These representations model time and its chronological progression and thereby facilitate temporal reasoning tasks. Our timeline approaches achieve competitive to superior accuracy compared to a prompt-based baseline while being computationally efficient. All code and data are available at https://tekayanidham.github.io/timeline-page/.

CVOct 14, 2021
Interactive Analysis of CNN Robustness

Stefan Sietzen, Mathias Lechner, Judy Borowski et al.

While convolutional neural networks (CNNs) have found wide adoption as state-of-the-art models for image-related tasks, their predictions are often highly sensitive to small input perturbations, which the human vision is robust against. This paper presents Perturber, a web-based application that allows users to instantaneously explore how CNN activations and predictions evolve when a 3D input scene is interactively perturbed. Perturber offers a large variety of scene modifications, such as camera controls, lighting and shading effects, background modifications, object morphing, as well as adversarial attacks, to facilitate the discovery of potential vulnerabilities. Fine-tuned model versions can be directly compared for qualitative evaluation of their robustness. Case studies with machine learning experts have shown that Perturber helps users to quickly generate hypotheses about model vulnerabilities and to qualitatively compare model behavior. Using quantitative analyses, we could replicate users' insights with other CNN architectures and input images, yielding new insights about the vulnerability of adversarially trained models.

HCOct 14, 2021
Does the Layout Really Matter? A Study on Visual Model Accuracy Estimation

Nicolas Grossmann, Jürgen Bernard, Michael Sedlmair et al.

In visual interactive labeling, users iteratively assign labels to data items until the machine model reaches an acceptable accuracy. A crucial step of this process is to inspect the model's accuracy and decide whether it is necessary to label additional elements. In scenarios with no or very little labeled data, visual inspection of the predictions is required. Similarity-preserving scatterplots created through a dimensionality reduction algorithm are a common visualization that is used in these cases. Previous studies investigated the effects of layout and image complexity on tasks like labeling. However, model evaluation has not been studied systematically. We present the results of an experiment studying the influence of image complexity and visual grouping of images on model accuracy estimation. We found that users outperform traditional automated approaches when estimating a model's accuracy. Furthermore, while the complexity of images impacts the overall performance, the layout of the items in the plot has little to no effect on estimations.

HCOct 15, 2019
Immersive Analytics of Large Dynamic Networks via Overview and Detail Navigation

Johannes Sorger, Manuela Waldner, Wolfgang Knecht et al.

Analysis of large dynamic networks is a thriving research field, typically relying on 2D graph representations. The advent of affordable head mounted displays however, sparked new interest in the potential of 3D visualization for immersive network analytics. Nevertheless, most solutions do not scale well with the number of nodes and edges and rely on conventional fly- or walk-through navigation. In this paper, we present a novel approach for the exploration of large dynamic graphs in virtual reality that interweaves two navigation metaphors: overview exploration and immersive detail analysis. We thereby use the potential of state-of-the-art VR headsets, coupled with a web-based 3D rendering engine that supports heterogeneous input modalities to enable ad-hoc immersive network analytics. We validate our approach through a performance evaluation and a case study with experts analyzing a co-morbidity network.

HCSep 2, 2019
Collecting and Structuring Information in the Information Collage

Sebastian Sippl, Michael Sedlmair, Manuela Waldner

Knowledge workers, such as scientists, journalists, or consultants, adaptively seek, gather, and consume information. These processes are often inefficient as existing user interfaces provide limited possibilities to combine information from various sources and different formats into a common knowledge representation. In this paper, we present the concept of an information collage (IC) -- a web browser extension combining manual spatial organization of gathered information fragments and automatic text analysis for interactive content exploration and expressive visual summaries. We used IC for case studies with knowledge workers from different domains and longer-term field studies over a period of one month. We identified three different ways how users collect and structure information and provide design recommendations how to support these observed usage strategies.

HCApr 12, 2017
How Sensemaking Tools Influence Display Space Usage

Thomas Geymayer, Manuela Waldner, Alexander Lex et al.

We explore how the availability of a sensemaking tool influences users' knowledge externalization strategies. On a large display, users were asked to solve an intelligence analysis task with or without a bidirectionally linked concept-graph (BLC) to organize insights into concepts (nodes) and relations (edges). In BLC, both nodes and edges maintain "deep links" to the exact source phrases and sections in associated documents. In our control condition, we were able to reproduce previously described spatial organization behaviors using document windows on the large display. When using BLC, however, we found that analysts apply spatial organization to BLC nodes instead, use significantly less display space and have significantly fewer open windows.