17.5HCMay 18
Noisy Graph Patterns via Ordered MatricesJules Wulms, Wouter Meulemans, Bettina Speckmann
The high-level structure of a graph is a crucial ingredient for the analysis and visualization of relational data. However, discovering the salient graph patterns that form this structure is notoriously difficult for two reasons. (1) Finding important patterns, such as cliques and bicliques, is computationally hard. (2) Real-world graphs contain noise, and therefore do not always exhibit patterns in their pure form. Defining meaningful noisy patterns and detecting them efficiently is a currently unsolved challenge. In this paper, we propose to use well-ordered matrices as a tool to both define and effectively detect noisy patterns. Specifically, we represent a graph as its adjacency matrix and optimally order it using Moran's $I$. Standard graph patterns (cliques, bicliques, and stars) now translate to rectangular submatrices. Using Moran's $I$, we define a permitted level of noise for such patterns. A combination of exact algorithms and heuristics allows us to efficiently decompose the matrix into noisy patterns. We also introduce a novel motif simplification that visualizes noisy patterns while explicitly encoding the level of noise. We showcase our techniques on several real-world data sets.
23.4HCMay 18
Unfolding Ordered Matrices into BioFabric MotifsJules Wulms, Wouter Meulemans, Bettina Speckmann
BioFabrics were introduced by Longabaugh in 2012 as a way to draw large graphs in a clear and uncluttered manner. The visual quality of BioFabrics crucially depends on the order of vertices and edges, which can be chosen independently. Effective orders can expose salient patterns, which in turn can be summarized by motifs, allowing users to take in complex networks at-a-glance. However, so far there is no efficient layout algorithm which automatically recognizes patterns and delivers both a vertex and an edge ordering that allows these patterns to be expressed as motifs. In this paper we show how to use well-ordered matrices as a tool to efficiently find good vertex and edge orders for BioFabrics. Specifically, we order the adjacency matrix of the input graph using Moran's $I$ and detect (noisy) patterns with our recent algorithm. In this note we show how to "unfold" the ordered matrix and its patterns into a high-quality BioFabric. Our pipelines easily handles graphs with up to 250 vertices.
LGDec 8, 2022
Better Hit the Nail on the Head than Beat around the Bush: Removing Protected Attributes with a Single ProjectionPantea Haghighatkhah, Antske Fokkens, Pia Sommerauer et al.
Bias elimination and recent probing studies attempt to remove specific information from embedding spaces. Here it is important to remove as much of the target information as possible, while preserving any other information present. INLP is a popular recent method which removes specific information through iterative nullspace projections. Multiple iterations, however, increase the risk that information other than the target is negatively affected. We introduce two methods that find a single targeted projection: Mean Projection (MP, more efficient) and Tukey Median Projection (TMP, with theoretical guarantees). Our comparison between MP and INLP shows that (1) one MP projection removes linear separability based on the target and (2) MP has less impact on the overall space. Further analysis shows that applying random projections after MP leads to the same overall effects on the embedding space as the multiple projections of INLP. Applying one targeted (MP) projection hence is methodologically cleaner than applying multiple (INLP) projections that introduce random effects.
HCSep 22, 2022
Characterizing Uncertainty in the Visual Text Analysis PipelinePantea Haghighatkhah, Mennatallah El-Assady, Jean-Daniel Fekete et al.
Current visual text analysis approaches rely on sophisticated processing pipelines. Each step of such a pipeline potentially amplifies any uncertainties from the previous step. To ensure the comprehensibility and interoperability of the results, it is of paramount importance to clearly communicate the uncertainty not only of the output but also within the pipeline. In this paper, we characterize the sources of uncertainty along the visual text analysis pipeline. Within its three phases of labeling, modeling, and analysis, we identify six sources, discuss the type of uncertainty they create, and how they propagate.
HCSep 24, 2021
Simultaneous Matrix Orderings for Graph CollectionsNathan van Beusekom, Wouter Meulemans, Bettina Speckmann
Undirected graphs are frequently used to model networks. The topology of an undirected graph G can be captured by an adjacency matrix; this matrix in turn can be visualized directly to give insight into the graph structure. Which visual patterns appear in such a matrix visualization depends on the ordering of its rows and columns. Formally defining the quality of an ordering and then automatically computing a high-quality ordering are both challenging problems; however, effective heuristics exist and are used in practice. Often, graphs exist as part of a collection of graphs on the same set of vertices. To visualize such graph collections, we need a single ordering that works well for all matrices simultaneously. The current state-of-the-art solves this problem by taking a (weighted) union over all graphs and applying existing heuristics. However, this union leads to a loss of information, specifically in those parts of the graphs which are different. We propose a collection-aware approach to avoid this loss of information and apply it to two popular heuristic methods: leaf order and barycenter. The de-facto standard computational quality metrics for matrix ordering capture only block-diagonal patterns (cliques). Instead, we propose to use Moran's I, a spatial auto-correlation metric, which captures the full range of established patterns. The popular leaf order method heuristically optimizes a similar measure which supports the use of Moran's I in this context. We evaluated our methods for simultaneous orderings on real-world datasets using Moran's I as the quality metric. Our results show that our collection-aware approach matches or improves performance compared to the union approach, depending on the similarity of the graphs in the collection. Specifically, our Moran's I-based collection-aware leaf order implementation consistently outperforms other implementations.
CGMay 17, 2021
Compacting Squares: Input-Sensitive In-Place Reconfiguration of Sliding SquaresHugo A. Akitaya, Erik D. Demaine, Matias Korman et al.
A well-established theoretical model for modular robots in two dimensions are edge-connected configurations of square modules, which can reconfigure through so-called sliding moves. Dumitrescu and Pach [Graphs and Combinatorics, 2006] proved that it is always possible to reconfigure one edge-connected configuration of $n$ squares into any other using at most $O(n^2)$ sliding moves, while keeping the configuration connected at all times. For certain pairs of configurations, reconfiguration may require $Ω(n^2)$ sliding moves. However, significantly fewer moves may be sufficient. We prove that it is NP-hard to minimize the number of sliding moves for a given pair of edge-connected configurations. On the positive side we present Gather&Compact, an input-sensitive in-place algorithm that requires only $O(\bar{P} n)$ sliding moves to transform one configuration into the other, where $\bar{P}$ is the maximum perimeter of the two bounding boxes. The squares move within the bounding boxes only, with the exception of at most one square at a time which may move through the positions adjacent to the bounding boxes. The $O(\bar{P} n)$ bound never exceeds $O(n^2)$, and is optimal (up to constant factors) among all bounds parameterized by just $n$ and $\bar{P}$. Our algorithm is built on the basic principle that well-connected components of modular robots can be transformed efficiently. Hence we iteratively increase the connectivity within a configuration, to finally arrive at a single solid $xy$-monotone component. We implemented Gather&Compact and compared it experimentally to the in-place modification by Moreno and Sacristán [EuroCG 2020] of the Dumitrescu and Pach algorithm (MSDP). Our experiments show that Gather&Compact consistently outperforms MSDP by a significant margin, on all types of square configurations.
HCDec 2, 2019
Stable Visual Summaries for Trajectory CollectionsJules Wulms, Juri Buchmüller, Wouter Meulemans et al.
The availability of devices that track moving objects has led to an explosive growth in trajectory data. When exploring the resulting large trajectory collections, visual summaries are a useful tool to identify time intervals of interest. A typical approach is to represent the spatial positions of the tracked objects at each time step via a one-dimensional ordering; visualizations of such orderings can then be placed in temporal order along a time line. There are two main criteria to assess the quality of the resulting visual summary: spatial quality -- how well does the ordering capture the structure of the data at each time step, and stability -- how coherent are the orderings over consecutive time steps or temporal ranges? In this paper we introduce a new Stable Principal Component (SPC) method to compute such orderings, which is explicitly parameterized for stability, allowing a trade-off between the spatial quality and stability. We conduct extensive computational experiments that quantitatively compare the orderings produced by ours and other stable dimensionality-reduction methods to various state-of-the-art approaches using a set of well-established quality metrics that capture spatial quality and stability. We conclude that stable dimensionality reduction outperforms existing methods on stability, without sacrificing spatial quality or efficiency; in particular, our new SPC method does so at a fraction of the computational costs.