Julien Tierny

LG
h-index19
24papers
626citations
Novelty41%
AI Score46

24 Papers

LGJun 27, 2022Code
Discrete Morse Sandwich: Fast Computation of Persistence Diagrams for Scalar Data -- An Algorithm and A Benchmark

Pierre Guillou, Jules Vidal, Julien Tierny

This paper introduces an efficient algorithm for persistence diagram computation, given an input piecewise linear scalar field $f$ defined on a $d$-dimensional simplicial complex $K$, with $d \leq 3$. Our work revisits the seminal algorithm "PairSimplices" [31], [103] with discrete Morse theory (DMT) [34], [80], which greatly reduces the number of input simplices to consider. Further, we also extend to DMT and accelerate the stratification strategy described in "PairSimplices" for the fast computation of the $0^{th}$ and $(d - 1)^{th}$ diagrams, noted $D_0(f)$ and $D_{d-1}(f)$. Minima-saddle persistence pairs ($D_0(f)$) and saddle-maximum persistence pairs ($D_{d-1}(f)$) are efficiently computed by processing, with a Union-Find, the unstable sets of $1$-saddles and the stable sets of $(d - 1)$-saddles. This fast pre-computation for the dimensions $0$ and $(d - 1)$ enables an aggressive specialization of [4] to the 3D case, which results in a drastic reduction of the number of input simplices for the computation of $D_1(f)$, the intermediate layer of the sandwich. Finally, we document several performance improvements via shared-memory parallelism. We provide an open-source implementation of our algorithm for reproducibility purposes. We also contribute a reproducible benchmark package, which exploits three-dimensional data from a public repository and compares our algorithm to a variety of publicly available implementations. Extensive experiments indicate that our algorithm improves by two orders of magnitude the time performance of the seminal "PairSimplices" algorithm it extends. Moreover, it also improves memory footprint and time performance over a selection of 14 competing approaches, with a substantial gain over the fastest available approaches, while producing a strictly identical output.

FLU-DYNJul 28, 2022
Topological Analysis of Ensembles of Hydrodynamic Turbulent Flows -- An Experimental Study

Florent Nauleau, Fabien Vivodtzev, Thibault Bridel-Bertomeu et al.

This application paper presents a comprehensive experimental evaluation of the suitability of Topological Data Analysis (TDA) for the quantitative comparison of turbulent flows. Specifically, our study documents the usage of the persistence diagram of the maxima of flow enstrophy (an established vorticity indicator), for the topological representation of 180 ensemble members, generated by a coarse sampling of the parameter space of five numerical solvers. We document five main hypotheses reported by domain experts, describing their expectations regarding the variability of the flows generated by the distinct solver configurations. We contribute three evaluation protocols to assess the validation of the above hypotheses by two comparison measures: (i) a standard distance used in scientific imaging (the L2 norm) and (ii) an established topological distance between persistence diagrams (the L2-Wasserstein metric). Extensive experiments on the input ensemble demonstrate the superiority of the topological distance (ii) to report as close to each other flows which are expected to be similar by domain experts, due to the configuration of their vortices. Overall, the insights reported by our study bring an experimental evidence of the suitability of TDA for representing and comparing turbulent flows, thereby providing to the fluid dynamics community confidence for its usage in future work. Also, our flow data and evaluation protocols provide to the TDA community an application-approved benchmark for the evaluation and design of further topological distances.

GRJul 22, 2022
Principal Geodesic Analysis of Merge Trees (and Persistence Diagrams)

Mathieu Pont, Jules Vidal, Julien Tierny

This paper presents a computational framework for the Principal Geodesic Analysis of merge trees (MT-PGA), a novel adaptation of the celebrated Principal Component Analysis (PCA) framework [87] to the Wasserstein metric space of merge trees [92]. We formulate MT-PGA computation as a constrained optimization problem, aiming at adjusting a basis of orthogonal geodesic axes, while minimizing a fitting energy. We introduce an efficient, iterative algorithm which exploits shared-memory parallelism, as well as an analytic expression of the fitting energy gradient, to ensure fast iterations. Our approach also trivially extends to extremum persistence diagrams. Extensive experiments on public ensembles demonstrate the efficiency of our approach - with MT-PGA computations in the orders of minutes for the largest examples. We show the utility of our contributions by extending to merge trees two typical PCA applications. First, we apply MT-PGA to data reduction and reliably compress merge trees by concisely representing them by their first coordinates in the MT-PGA basis. Second, we present a dimensionality reduction framework exploiting the first two directions of the MT-PGA basis to generate two-dimensional layouts of the ensemble. We augment these layouts with persistence correlation views, enabling global and local visual inspections of the feature variability in the ensemble. In both applications, quantitative experiments assess the relevance of our framework. Finally, we provide a C++ implementation that can be used to reproduce our results.

LGJul 5, 2023
Wasserstein Auto-Encoders of Merge Trees (and Persistence Diagrams)

Mahieu Pont, Julien Tierny

This paper presents a computational framework for the Wasserstein auto-encoding of merge trees (MT-WAE), a novel extension of the classical auto-encoder neural network architecture to the Wasserstein metric space of merge trees. In contrast to traditional auto-encoders which operate on vectorized data, our formulation explicitly manipulates merge trees on their associated metric space at each layer of the network, resulting in superior accuracy and interpretability. Our novel neural network approach can be interpreted as a non-linear generalization of previous linear attempts [79] at merge tree encoding. It also trivially extends to persistence diagrams. Extensive experiments on public ensembles demonstrate the efficiency of our algorithms, with MT-WAE computations in the orders of minutes on average. We show the utility of our contributions in two applications adapted from previous work on merge tree encoding [79]. First, we apply MT-WAE to merge tree compression, by concisely representing them with their coordinates in the final layer of our auto-encoder. Second, we document an application to dimensionality reduction, by exploiting the latent space of our auto-encoder, for the visual analysis of ensemble data. We illustrate the versatility of our framework by introducing two penalty terms, to help preserve in the latent space both the Wasserstein distances between merge trees, as well as their clusters. In both applications, quantitative experiments assess the relevance of our framework. Finally, we provide a C++ implementation that can be used for reproducibility.

DCOct 12, 2023
TTK is Getting MPI-Ready

Eve Le Guillou, Michael Will, Pierre Guillou et al.

This system paper documents the technical foundations for the extension of the Topology ToolKit (TTK) to distributed-memory parallelism with the Message Passing Interface (MPI). While several recent papers introduced topology-based approaches for distributed-memory environments, these were reporting experiments obtained with tailored, mono-algorithm implementations. In contrast, we describe in this paper a versatile approach (supporting both triangulated domains and regular grids) for the support of topological analysis pipelines, i.e. a sequence of topological algorithms interacting together. While developing this extension, we faced several algorithmic and software engineering challenges, which we document in this paper. We describe an MPI extension of TTK's data structure for triangulation representation and traversal, a central component to the global performance and generality of TTK's topological implementations. We also introduce an intermediate interface between TTK and MPI, both at the global pipeline level, and at the fine-grain algorithmic level. We provide a taxonomy for the distributed-memory topological algorithms supported by TTK, depending on their communication needs and provide examples of hybrid MPI+thread parallelizations. Performance analyses show that parallel efficiencies range from 20% to 80% (depending on the algorithms), and that the MPI-specific preconditioning introduced by our framework induces a negligible computation time overhead. We illustrate the new distributed-memory capabilities of TTK with an example of advanced analysis pipeline, combining multiple algorithms, run on the largest publicly available dataset we have found (120 billion vertices) on a cluster with 64 nodes (for a total of 1536 cores). Finally, we provide a roadmap for the completion of TTK's MPI extension, along with generic recommendations for each algorithm communication category.

LGApr 28, 2023
Wasserstein Dictionaries of Persistence Diagrams

Keanu Sisouk, Julie Delon, Julien Tierny

This paper presents a computational framework for the concise encoding of an ensemble of persistence diagrams, in the form of weighted Wasserstein barycenters [100], [102] of a dictionary of atom diagrams. We introduce a multi-scale gradient descent approach for the efficient resolution of the corresponding minimization problem, which interleaves the optimization of the barycenter weights with the optimization of the atom diagrams. Our approach leverages the analytic expressions for the gradient of both sub-problems to ensure fast iterations and it additionally exploits shared-memory parallelism. Extensive experiments on public ensembles demonstrate the efficiency of our approach, with Wasserstein dictionary computations in the orders of minutes for the largest examples. We show the utility of our contributions in two applications. First, we apply Wassserstein dictionaries to data reduction and reliably compress persistence diagrams by concisely representing them with their weights in the dictionary. Second, we present a dimensionality reduction framework based on a Wasserstein dictionary defined with a small number of atoms (typically three) and encode the dictionary as a low dimensional simplex embedded in a visual space (typically in 2D). In both applications, quantitative experiments assess the relevance of our framework. Finally, we provide a C++ implementation that can be used to reproduce our results.

LGJul 17, 2024
A Practical Solver for Scalar Data Topological Simplification

Mohamed Kissi, Mathieu Pont, Joshua A. Levine et al.

This paper presents a practical approach for the optimization of topological simplification, a central pre-processing step for the analysis and visualization of scalar data. Given an input scalar field f and a set of "signal" persistence pairs to maintain, our approach produces an output field g that is close to f and which optimizes (i) the cancellation of "non-signal" pairs, while (ii) preserving the "signal" pairs. In contrast to pre-existing simplification algorithms, our approach is not restricted to persistence pairs involving extrema and can thus address a larger class of topological features, in particular saddle pairs in three-dimensional scalar data. Our approach leverages recent generic persistence optimization frameworks and extends them with tailored accelerations specific to the problem of topological simplification. Extensive experiments report substantial accelerations over these frameworks, thereby making topological simplification optimization practical for real-life datasets. Our approach enables a direct visualization and analysis of the topologically simplified data, e.g., via isosurfaces of simplified topology (fewer components and handles). We apply our approach to the extraction of prominent filament structures in three-dimensional data. Specifically, we show that our pre-simplification of the data leads to practical improvements over standard topological techniques for removing filament loops. We also show how our approach can be used to repair genus defects in surface processing. Finally, we provide a C++ implementation for reproducibility purposes.

CGDec 15, 2025Code
Continuous Edit Distance, Geodesics and Barycenters of Time-varying Persistence Diagrams

Sebastien Tchitchek, Mohamed Kissi, Julien Tierny

We introduce the Continuous Edit Distance (CED), a geodesic and elastic distance for time-varying persistence diagrams (TVPDs). The CED extends edit-distance ideas to TVPDs by combining local substitution costs with penalized deletions/insertions, controlled by two parameters: \(α\) (trade-off between temporal misalignment and diagram discrepancy) and \(β\) (gap penalty). We also provide an explicit construction of CED-geodesics. Building on these ingredients, we present two practical barycenter solvers, one stochastic and one greedy, that monotonically decrease the CED Frechet energy. Empirically, the CED is robust to additive perturbations (both temporal and spatial), recovers temporal shifts, and supports temporal pattern search. On real-life datasets, the CED achieves clustering performance comparable or better than standard elastic dissimilarities, while our clustering based on CED-barycenters yields superior classification results. Overall, the CED equips TVPD analysis with a principled distance, interpretable geodesics, and practical barycenters, enabling alignment, comparison, averaging, and clustering directly in the space of TVPDs. A C++ implementation is provided for reproducibility at the following address https://github.com/sebastien-tchitchek/ContinuousEditDistance.

FLU-DYNAug 21, 2024
Identifying Locally Turbulent Vortices within Instabilities

Fabien Vivodtzev, Florent Nauleau, Jean-Philippe Braeunig et al.

This work presents an approach for the automatic detection of locally turbulent vortices within turbulent 2D flows such as instabilites. First, given a time step of the flow, methods from Topological Data Analysis (TDA) are leveraged to extract the geometry of the vortices. Specifically, the enstrophy of the flow is simplified by topological persistence, and the vortices are extracted by collecting the basins of the simplified enstrophy's Morse complex. Next, the local kinetic energy power spectrum is computed for each vortex. We introduce a set of indicators based on the kinetic energy power spectrum to estimate the correlation between the vortex's behavior and that of an idealized turbulent vortex. Our preliminary experiments show the relevance of these indicators for distinguishing vortices which are turbulent from those which have not yet reached a turbulent state and thus known as laminar.

CGFeb 27, 2025Code
Topological Autoencoders++: Fast and Accurate Cycle-Aware Dimensionality Reduction

Mattéo Clémot, Julie Digne, Julien Tierny

This paper presents a novel topology-aware dimensionality reduction approach aiming at accurately visualizing the cyclic patterns present in high dimensional data. To that end, we build on the Topological Autoencoders (TopoAE) formulation. First, we provide a novel theoretical analysis of its associated loss and show that a zero loss indeed induces identical persistence pairs (in high and low dimensions) for the $0$-dimensional persistent homology (PH$^0$) of the Rips filtration. We also provide a counter example showing that this property no longer holds for a naive extension of TopoAE to PH$^d$ for $d\ge 1$. Based on this observation, we introduce a novel generalization of TopoAE to $1$-dimensional persistent homology (PH$^1$), called TopoAE++, for the accurate generation of cycle-aware planar embeddings, addressing the above failure case. This generalization is based on the notion of cascade distortion, a new penalty term favoring an isometric embedding of the $2$-chains filling persistent $1$-cycles, hence resulting in more faithful geometrical reconstructions of the $1$-cycles in the plane. We further introduce a novel, fast algorithm for the exact computation of PH for Rips filtrations in the plane, yielding improved runtimes over previously documented topology-aware methods. Our method also achieves a better balance between the topological accuracy, as measured by the Wasserstein distance, and the visual preservation of the cycles in low dimensions. Our C++ implementation is available at https://github.com/MClemot/TopologicalAutoencodersPlusPlus.

LGSep 18, 2025Code
Robust Barycenters of Persistence Diagrams

Keanu Sisouk, Eloi Tanguy, Julie Delon et al.

This short paper presents a general approach for computing robust Wasserstein barycenters of persistence diagrams. The classical method consists in computing assignment arithmetic means after finding the optimal transport plans between the barycenter and the persistence diagrams. However, this procedure only works for the transportation cost related to the $q$-Wasserstein distance $W_q$ when $q=2$. We adapt an alternative fixed-point method to compute a barycenter diagram for generic transportation costs ($q > 1$), in particular those robust to outliers, $q \in (1,2)$. We show the utility of our work in two applications: \emph{(i)} the clustering of persistence diagrams on their metric space and \emph{(ii)} the dictionary encoding of persistence diagrams. In both scenarios, we demonstrate the added robustness to outliers provided by our generalized framework. Our Python implementation is available at this address: https://github.com/Keanu-Sisouk/RobustBarycenter .

LGApr 4, 2025Code
BondMatcher: H-Bond Stability Analysis in Molecular Systems

Thomas Daniel, Malgorzata Olejniczak, Julien Tierny

This application paper investigates the stability of hydrogen bonds (H-bonds), as characterized by the Quantum Theory of Atoms in Molecules (QTAIM). First, we contribute a database of 4544 electron densities associated to four isomers of water hexamers (the so-called Ring, Book, Cage and Prism), generated by distorting their equilibrium geometry under various structural perturbations, modeling the natural dynamic behavior of molecular systems. Second, we present a new stability measure, called bond occurrence rate, associating each bond path present at equilibrium with its rate of occurrence within the input ensemble. We also provide an algorithm, called BondMatcher, for its automatic computation, based on a tailored, geometry-aware partial isomorphism estimation between the extremum graphs of the considered electron densities. Our new stability measure allows for the automatic identification of densities lacking H-bond paths, enabling further visual inspections. Specifically, the topological analysis enabled by our framework corroborates experimental observations and provides refined geometrical criteria for characterizing the disappearance of H-bond paths. Our electron density database and our C++ implementation are available at this address: https://github.com/thom-dani/BondMatcher.

DMJun 21, 2018Code
Topological Data Analysis Made Easy with the Topology ToolKit

Guillaume Favelier, Charles Gueunet, Attila Gyulassy et al.

This tutorial presents topological methods for the analysis and visualization of scientific data from a user's perspective, with the Topology ToolKit (TTK), a recently released open-source library for topological data analysis. Topological methods have gained considerably in popularity and maturity over the last twenty years and success stories of established methods have been documented in a wide range of applications (combustion, chemistry, astrophysics, material sciences, etc.) with both acquired and simulated data, in both post-hoc and in-situ contexts. While reference textbooks have been published on the topic, no tutorial at IEEE VIS has covered this area in recent years, and never at a software level and from a user's point-of-view. This tutorial fills this gap by providing a beginner's introduction to topological methods for practitioners, researchers, students, and lecturers. In particular, instead of focusing on theoretical aspects and algorithmic details, this tutorial focuses on how topological methods can be useful in practice for concrete data analysis tasks such as segmentation, feature extraction or tracking. The tutorial describes in detail how to achieve these tasks with TTK. First, after an introduction to topological methods and their application in data analysis, a brief overview of TTK's main entry point for end users, namely ParaView, will be presented. Second, an overview of TTK's main features will be given. A running example will be described in detail, showcasing how to access TTK's features via ParaView, Python, VTK/C++, and C++. Third, hands-on sessions will concretely show how to use TTK in ParaView for multiple, representative data analysis tasks. Fourth, the usage of TTK will be presented for developers, in particular by describing several examples of visualization and data analysis projects that were built on top of TTK. Finally, some feedback regarding the usage of TTK as a teaching platform for topological analysis will be given. Presenters of this tutorial include experts in topological methods, core authors of TTK as well as active users, coming from academia, labs, or industry. A large part of the tutorial will be dedicated to hands-on exercises and a rich material package (including TTK pre-installs in virtual machines, code, data, demos, video tutorials, etc.) will be provided to the participants. This tutorial mostly targets students, practitioners and researchers who are not experts in topological methods but who are interested in using them in their daily tasks. We also target researchers already familiar to topological methods and who are interested in using or contributing to TTK.

GRMay 22, 2018Code
The Topology ToolKit

Julien Tierny, Guillaume Favelier, Joshua A. Levine et al.

This system paper presents the Topology ToolKit (TTK), a software platform designed for topological data analysis in scientific visualization. TTK provides a unified, generic, efficient, and robust implementation of key algorithms for the topological analysis of scalar data, including: critical points, integral lines, persistence diagrams, persistence curves, merge trees, contour trees, Morse-Smale complexes, fiber surfaces, continuous scatterplots, Jacobi sets, Reeb spaces, and more. TTK is easily accessible to end users due to a tight integration with ParaView. It is also easily accessible to developers through a variety of bindings (Python, VTK/C++) for fast prototyping or through direct, dependence-free, C++, to ease integration into pre-existing complex systems. While developing TTK, we faced several algorithmic and software engineering challenges, which we document in this paper. In particular, we present an algorithm for the construction of a discrete gradient that complies to the critical points extracted in the piecewise-linear setting. This algorithm guarantees a combinatorial consistency across the topological abstractions supported by TTK, and importantly, a unified implementation of topological data simplification for multi-scale exploration and analysis. We also present a cached triangulation data structure, that supports time efficient and generic traversals, which self-adjusts its memory usage on demand for input simplicial meshes and which implicitly emulates a triangulation for regular grids with no memory overhead. Finally, we describe an original software architecture, which guarantees memory efficient and direct accesses to TTK features, while still allowing for researchers powerful and easy bindings and extensions. TTK is open source (BSD license) and its code, online documentation and video tutorials are available on TTK's website.

LGFeb 4, 2025
A User's Guide to Sampling Strategies for Sliced Optimal Transport

Keanu Sisouk, Julie Delon, Julien Tierny

This paper serves as a user's guide to sampling strategies for sliced optimal transport. We provide reminders and additional regularity results on the Sliced Wasserstein distance. We detail the construction methods, generation time complexity, theoretical guarantees, and conditions for each strategy. Additionally, we provide insights into their suitability for sliced optimal transport in theory. Extensive experiments on both simulated and real-world data offer a representative comparison of the strategies, culminating in practical recommendations for their best usage.

LGAug 25, 2025
Topology Aware Neural Interpolation of Scalar Fields

Mohamed Kissi, Keanu Sisouk, Joshua A. Levine et al.

This paper presents a neural scheme for the topology-aware interpolation of time-varying scalar fields. Given a time-varying sequence of persistence diagrams, along with a sparse temporal sampling of the corresponding scalar fields, denoted as keyframes, our interpolation approach aims at "inverting" the non-keyframe diagrams to produce plausible estimations of the corresponding, missing data. For this, we rely on a neural architecture which learns the relation from a time value to the corresponding scalar field, based on the keyframe examples, and reliably extends this relation to the non-keyframe time steps. We show how augmenting this architecture with specific topological losses exploiting the input diagrams both improves the geometrical and topological reconstruction of the non-keyframe time steps. At query time, given an input time value for which an interpolation is desired, our approach instantaneously produces an output, via a single propagation of the time input through the network. Experiments interpolating 2D and 3D time-varying datasets show our approach superiority, both in terms of data and topological fitting, with regard to reference interpolation schemes.

GRJul 16, 2021
Wasserstein Distances, Geodesics and Barycenters of Merge Trees

Mathieu Pont, Jules Vidal, Julie Delon et al.

This paper presents a unified computational framework for the estimation of distances, geodesics and barycenters of merge trees. We extend recent work on the edit distance [106] and introduce a new metric, called the Wasserstein distance between merge trees, which is purposely designed to enable efficient computations of geodesics and barycenters. Specifically, our new distance is strictly equivalent to the L2-Wasserstein distance between extremum persistence diagrams, but it is restricted to a smaller solution space, namely, the space of rooted partial isomorphisms between branch decomposition trees. This enables a simple extension of existing optimization frameworks [112] for geodesics and barycenters from persistence diagrams to merge trees. We introduce a task-based algorithm which can be generically applied to distance, geodesic, barycenter or cluster computation. The task-based nature of our approach enables further accelerations with shared-memory parallelism. Extensive experiments on public ensembles and SciVis contest benchmarks demonstrate the efficiency of our approach -- with barycenter computations in the orders of minutes for the largest examples -- as well as its qualitative ability to generate representative barycenter merge trees, visually summarizing the features of interest found in the ensemble. We show the utility of our contributions with dedicated visualization applications: feature tracking, temporal reduction and ensemble clustering. We provide a lightweight C++ implementation that can be used to reproduce our results.

GRSep 3, 2020
TopoMap: A 0-dimensional Homology Preserving Projection of High-Dimensional Data

Harish Doraiswamy, Julien Tierny, Paulo J. S. Silva et al.

Multidimensional Projection is a fundamental tool for high-dimensional data analytics and visualization. With very few exceptions, projection techniques are designed to map data from a high-dimensional space to a visual space so as to preserve some dissimilarity (similarity) measure, such as the Euclidean distance for example. In fact, although adopting distinct mathematical formulations designed to favor different aspects of the data, most multidimensional projection methods strive to preserve dissimilarity measures that encapsulate geometric properties such as distances or the proximity relation between data objects. However, geometric relations are not the only interesting property to be preserved in a projection. For instance, the analysis of particular structures such as clusters and outliers could be more reliably performed if the mapping process gives some guarantee as to topological invariants such as connected components and loops. This paper introduces TopoMap, a novel projection technique which provides topological guarantees during the mapping process. In particular, the proposed method performs the mapping from a high-dimensional space to a visual space, while preserving the 0-dimensional persistence diagram of the Rips filtration of the high-dimensional data, ensuring that the filtrations generate the same connected components when applied to the original as well as projected data. The presented case studies show that the topological guarantee provided by TopoMap not only brings confidence to the visual analytic process but also can be used to assist in the assessment of other projection methods.

GROct 17, 2019
Statistical Parameter Selection for Clustering Persistence Diagrams

Max Kontak, Jules Vidal, Julien Tierny

In urgent decision making applications, ensemble simulations are an important way to determine different outcome scenarios based on currently available data. In this paper, we will analyze the output of ensemble simulations by considering so-called persistence diagrams, which are reduced representations of the original data, motivated by the extraction of topological features. Based on a recently published progressive algorithm for the clustering of persistence diagrams, we determine the optimal number of clusters, and therefore the number of significantly different outcome scenarios, by the minimization of established statistical score functions. Furthermore, we present a proof-of-concept prototype implementation of the statistical selection of the number of clusters and provide the results of an experimental study, where this implementation has been applied to real-world ensemble data sets.

GEO-PHAug 20, 2019
Ranking Viscous Finger Simulations to an Acquired Ground Truth with Topology-aware Matchings

Maxime Soler, Martin Petitfrere, Gilles Darche et al.

This application paper presents a novel framework based on topological data analysis for the automatic evaluation and ranking of viscous finger simulation runs in an ensemble with respect to a reference acquisition. Individual fingers in a given time-step are associated with critical point pairs in the distance field to the injection point, forming persistence diagrams. Different metrics, based on optimal transport, for comparing time-varying persistence diagrams in this specific applicative case are introduced. We evaluate the relevance of the rankings obtained with these metrics, both qualitatively thanks to a lightweight web visual interface, and quantitatively by studying the deviation from a reference ranking suggested by experts. Extensive experiments show the quantitative superiority of our approach compared to traditional alternatives. Our web interface allows experts to conveniently explore the produced rankings. We show a complete viscous fingering case study demonstrating the utility of our approach in the context of porous media fluid flow, where our framework can be used to automatically discard physically-irrelevant simulation runs from the ensemble and rank the most plausible ones. We document an in-situ implementation to lighten I/O and performance constraints arising in the context of parametric studies.

GRJul 10, 2019
Progressive Wasserstein Barycenters of Persistence Diagrams

Jules Vidal, Joseph Budin, Julien Tierny

This paper presents an efficient algorithm for the progressive approximation of Wasserstein barycenters of persistence diagrams, with applications to the visual analysis of ensemble data. Given a set of scalar fields, our approach enables the computation of a persistence diagram which is representative of the set, and which visually conveys the number, data ranges and saliences of the main features of interest found in the set. Such representative diagrams are obtained by computing explicitly the discrete Wasserstein barycenter of the set of persistence diagrams, a notoriously computationally intensive task. In particular, we revisit efficient algorithms for Wasserstein distance approximation [12,51] to extend previous work on barycenter estimation [94]. We present a new fast algorithm, which progressively approximates the barycenter by iteratively increasing the computation accuracy as well as the number of persistent features in the output diagram. Such a progressivity drastically improves convergence in practice and allows to design an interruptible algorithm, capable of respecting computation time constraints. This enables the approximation of Wasserstein barycenters within interactive times. We present an application to ensemble clustering where we revisit the k-means algorithm to exploit our barycenters and compute, within execution time constraints, meaningful clusters of ensemble data along with their barycenter diagram. Extensive experiments on synthetic and real-life data sets report that our algorithm converges to barycenters that are qualitatively meaningful with regard to the applications, and quantitatively comparable to previous techniques, while offering an order of magnitude speedup when run until convergence (without time constraint). Our algorithm can be trivially parallelized to provide additional speedups in practice on standard workstations. [...]

IVAug 17, 2018
Lifted Wasserstein Matcher for Fast and Robust Topology Tracking

Maxime Soler, Mélanie Plainchault, Bruno Conche et al.

This paper presents a robust and efficient method for tracking topological features in time-varying scalar data. Structures are tracked based on the optimal matching between persistence diagrams with respect to the Wasserstein metric. This fundamentally relies on solving the assignment problem, a special case of optimal transport, for all consecutive timesteps. Our approach relies on two main contributions. First, we revisit the seminal assignment algorithm by Kuhn and Munkres which we specifically adapt to the problem of matching persistence diagrams in an efficient way. Second, we propose an extension of the Wasserstein metric that significantly improves the geometrical stability of the matching of domain-embedded persistence pairs. We show that this geometrical lifting has the additional positive side-effect of improving the assignment matrix sparsity and therefore computing time. The global framework implements a coarse-grained parallelism by computing persistence diagrams and finding optimal matchings in parallel for every couple of consecutive timesteps. Critical trajectories are constructed by associating successively matched persistence pairs over time. Merging and splitting events are detected with a geometrical threshold in a post-processing stage. Extensive experiments on real-life datasets show that our matching approach is an order of magnitude faster than the seminal Munkres algorithm. Moreover, compared to a modern approximation method, our method provides competitive runtimes while yielding exact results. We demonstrate the utility of our global framework by extracting critical point trajectories from various simulated time-varying datasets and compare it to the existing methods based on associated overlaps of volumes. Robustness to noise and temporal resolution downsampling is empirically demonstrated.

GRJul 30, 2018
Persistence Atlas for Critical Point Variability in Ensembles

Guillaume Favelier, Noura Faraj, Brian Summa et al.

This paper presents a new approach for the visualization and analysis of the spatial variability of features of interest represented by critical points in ensemble data. Our framework, called Persistence Atlas, enables the visualization of the dominant spatial patterns of critical points, along with statistics regarding their occurrence in the ensemble. The persistence atlas represents in the geometrical domain each dominant pattern in the form of a confidence map for the appearance of critical points. As a by-product, our method also provides 2-dimensional layouts of the entire ensemble, highlighting the main trends at a global level. Our approach is based on the new notion of Persistence Map, a measure of the geometrical density in critical points which leverages the robustness to noise of topological persistence to better emphasize salient features. We show how to leverage spectral embedding to represent the ensemble members as points in a low-dimensional Euclidean space, where distances between points measure the dissimilarities between critical point layouts and where statistical tasks, such as clustering, can be easily carried out. Further, we show how the notion of mandatory critical point can be leveraged to evaluate for each cluster confidence regions for the appearance of critical points. Most of the steps of this framework can be trivially parallelized and we show how to efficiently implement them. Extensive experiments demonstrate the relevance of our approach. The accuracy of the confidence regions provided by the persistence atlas is quantitatively evaluated and compared to a baseline strategy using an off-the-shelf clustering approach. We illustrate the importance of the persistence atlas in a variety of real-life datasets, where clear trends in feature layouts are identified and analyzed.

IVFeb 8, 2018
Topologically Controlled Lossy Compression

Maxime Soler, Melanie Plainchault, Bruno Conche et al.

This paper presents a new algorithm for the lossy compression of scalar data defined on 2D or 3D regular grids, with topological control. Certain techniques allow users to control the pointwise error induced by the compression. However, in many scenarios it is desirable to control in a similar way the preservation of higher-level notions, such as topological features , in order to provide guarantees on the outcome of post-hoc data analyses. This paper presents the first compression technique for scalar data which supports a strictly controlled loss of topological features. It provides users with specific guarantees both on the preservation of the important features and on the size of the smaller features destroyed during compression. In particular, we present a simple compression strategy based on a topologically adaptive quantization of the range. Our algorithm provides strong guarantees on the bottleneck distance between persistence diagrams of the input and decompressed data, specifically those associated with extrema. A simple extension of our strategy additionally enables a control on the pointwise error. We also show how to combine our approach with state-of-the-art compressors, to further improve the geometrical reconstruction. Extensive experiments, for comparable compression rates, demonstrate the superiority of our algorithm in terms of the preservation of topological features. We show the utility of our approach by illustrating the compatibility between the output of post-hoc topological data analysis pipelines, executed on the input and decompressed data, for simulated or acquired data sets. We also provide a lightweight VTK-based C++ implementation of our approach for reproduction purposes.