LG · Aug 29, 2023
Navigating Perplexity: A linear relationship with the data set size in t-SNE embeddings
Martin Skrodzki, Nicolas F. Chaves-de-Plaza, Thomas Höllt et al.
Widely used pipelines for analyzing high-dimensional data utilize two-dimensional visualizations. These are created, for instance, via t-distributed stochastic neighbor embedding (t-SNE). A crucial element of the t-SNE embedding procedure is the perplexity hyperparameter. That is because the embedding structure varies when perplexity is changed. A suitable perplexity choice depends on the data set and the intended usage for the embedding. Therefore, perplexity is often chosen based on heuristics, intuition, and prior experience. This paper uncovers a linear relationship between perplexity and the data set size. Namely, we show that embeddings remain structurally consistent across data set samples when perplexity is adjusted accordingly. Qualitative and quantitative experimental results support these findings. This informs the visualization process, guiding the user when choosing a perplexity value. Finally, we outline several applications for the visualization of high-dimensional data via t-SNE based on this linear relationship.
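As a rough illustration of the linear relationship described in this abstract, the sketch below rescales a perplexity value when moving from a full data set to a subsample. It assumes plain proportionality (perplexity divided by data set size stays constant), which is one simple reading of "linear"; the function name `scaled_perplexity` and its parameters are hypothetical and not taken from the paper.

```python
def scaled_perplexity(perplexity_full: float, n_full: int, n_sample: int) -> float:
    """Rescale perplexity for a subsample, assuming perplexity / n stays constant.

    Assumption (not from the paper): the linear relationship is a pure
    proportionality without an offset term.
    """
    if n_sample < 2 or n_full < n_sample:
        raise ValueError("need 2 <= n_sample <= n_full")
    # Keep the ratio of perplexity to data set size fixed.
    perplexity = perplexity_full * n_sample / n_full
    # Perplexity must stay below the number of points; clamp to a usable range.
    return max(1.0, min(perplexity, float(n_sample - 1)))


if __name__ == "__main__":
    # A perplexity of 50 on 10,000 points maps to 10 on a 2,000-point sample.
    print(scaled_perplexity(50.0, 10_000, 2_000))
```

The resulting value could then be passed to any t-SNE implementation that exposes a perplexity setting, for example the `perplexity` parameter of scikit-learn's `TSNE`.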
HC · Jan 23, 2024
Accelerating hyperbolic t-SNE
Martin Skrodzki, Hunter van Geffen, Nicolas F. Chaves-de-Plaza et al.
The need to understand the structure of hierarchical or high-dimensional data is present in a variety of fields. Hyperbolic spaces have proven to be an important tool for embedding computations and analysis tasks as their non-linear nature lends itself well to tree or graph data. Subsequently, they have also been used in the visualization of high-dimensional data, where they exhibit increased embedding performance. However, none of the existing dimensionality reduction methods for embedding into hyperbolic spaces scale well with the size of the input data. That is because the embeddings are computed via iterative optimization schemes and the computation cost of every iteration is quadratic in the size of the input. Furthermore, due to the non-linear nature of hyperbolic spaces, Euclidean acceleration structures cannot directly be translated to the hyperbolic setting. This paper introduces the first acceleration structure for hyperbolic embeddings, building upon a polar quadtree. We compare our approach with existing methods and demonstrate that it computes embeddings of similar quality in significantly less time. Implementation and scripts for the experiments can be found at https://graphics.tudelft.nl/accelerating-hyperbolic-tsne.
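To make the quadratic per-iteration cost concrete, the following sketch computes all pairwise hyperbolic distances of an embedding naively. It assumes the Poincaré disk model of the hyperbolic plane, which the abstract does not specify; the function names are hypothetical, and this is the unaccelerated baseline, not the polar quadtree structure introduced in the paper.

```python
import math
from itertools import combinations


def poincare_distance(u: tuple[float, float], v: tuple[float, float]) -> float:
    """Hyperbolic distance between two points inside the unit (Poincaré) disk."""
    diff_sq = (u[0] - v[0]) ** 2 + (u[1] - v[1]) ** 2
    u_sq = u[0] ** 2 + u[1] ** 2
    v_sq = v[0] ** 2 + v[1] ** 2
    if u_sq >= 1.0 or v_sq >= 1.0:
        raise ValueError("points must lie strictly inside the unit disk")
    return math.acosh(1.0 + 2.0 * diff_sq / ((1.0 - u_sq) * (1.0 - v_sq)))


def all_pairs_distances(points: list[tuple[float, float]]) -> dict[tuple[int, int], float]:
    """Naive baseline: one distance per unordered pair, i.e. n * (n - 1) / 2 evaluations."""
    return {
        (i, j): poincare_distance(points[i], points[j])
        for i, j in combinations(range(len(points)), 2)
    }


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (-0.3, -0.3)]
    dists = all_pairs_distances(pts)
    print(len(dists), "pairwise distances for", len(pts), "points")
```

Because the distance depends non-linearly on how close both points are to the disk boundary, a Euclidean spatial subdivision cannot simply be reused to approximate these sums, which is the gap the abstract describes.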
GR · Jul 2, 2020
Surface Denoising based on Normal Filtering in a Robust Statistics Framework
Sunil Kumar Yadav, Martin Skrodzki, Eric Zimmermann et al.
During a surface acquisition process using 3D scanners, noise is inevitable, and an important step in geometry processing is to remove the noise components from the acquired surfaces (given as point sets or triangulated meshes). The noise-removal process (denoising) can be performed by filtering the surface normals first and by adjusting the vertex positions according to the filtered normals afterwards. Therefore, in many available denoising algorithms, the computation of noise-free normals is a key factor. A variety of filters have been introduced for noise removal from normals, with different focus points like robustness against outliers or large noise amplitudes. Although these filters perform well in different aspects, a unified framework is missing to establish the relation between them and to provide a theoretical analysis beyond the performance of each method. In this paper, we introduce such a framework to establish relations between a number of widely used nonlinear filters for face normals in mesh denoising and vertex normals in point set denoising. We cover robust statistical estimation with M-smoothers and their application to linear and non-linear normal filtering. Although these methods originate in different mathematical theories - which include diffusion-, bilateral-, and directional curvature-based algorithms - we demonstrate that all of them can be cast into a unified framework of robust statistics using robust error norms and their corresponding influence functions. This unification contributes to a better understanding of the individual methods and their relations with each other. Furthermore, the presented framework provides a platform for new techniques to combine the advantages of known filters and to compare them with available methods.
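The pairing of an error norm with its influence function can be sketched in a few lines. The example below contrasts the quadratic (L2) norm with a Gaussian, Welsch-type robust norm; the influence function is the derivative of the error norm, and dividing it by the residual gives the weight applied to a neighbor. The choice of these two norms and all function names are illustrative assumptions, not the specific set of filters analyzed in the paper.

```python
import math


def l2_norm(x: float) -> float:
    """Quadratic error norm rho(x) = x^2 / 2 (not robust to outliers)."""
    return 0.5 * x * x


def l2_influence(x: float) -> float:
    """Influence psi(x) = rho'(x) = x grows without bound with the residual."""
    return x


def welsch_norm(x: float, sigma: float) -> float:
    """Gaussian (Welsch-type) error norm: sigma^2 * (1 - exp(-x^2 / (2 sigma^2)))."""
    return sigma * sigma * (1.0 - math.exp(-x * x / (2.0 * sigma * sigma)))


def welsch_influence(x: float, sigma: float) -> float:
    """Influence psi(x) = rho'(x) = x * exp(-x^2 / (2 sigma^2)); decays for outliers."""
    return x * math.exp(-x * x / (2.0 * sigma * sigma))


def welsch_weight(x: float, sigma: float) -> float:
    """Weight psi(x) / x: the Gaussian range weight familiar from bilateral filtering."""
    return math.exp(-x * x / (2.0 * sigma * sigma))


if __name__ == "__main__":
    sigma = 0.3
    for residual in (0.1, 0.3, 1.0, 2.0):
        print(residual, l2_influence(residual), welsch_influence(residual, sigma))
```

Under the quadratic norm, a large difference between two normals pulls the filtered result with full force, whereas the robust norm lets its influence fade, which is why such filters preserve sharp features.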
SE · Apr 28, 2020
How the deprecation of Java applets affected online visualization frameworks -- a case study
Martin Skrodzki
The JavaView visualization framework was designed at the end of the 1990s as software that provides, among other services, easy, interactive geometry visualizations on web pages. We discuss how this and other design goals were met and present several applications to highlight the contemporary use cases of the framework. However, because JavaView's easy web export was based on Java applets, the deprecation of this technology disabled one main functionality of the software. The remainder of the article uses JavaView as an example to highlight the effects of changes in the underlying programming language on a visualization toolkit. We discuss possible reactions of software to such challenges, where the JavaView framework serves as an example to illustrate development decisions. These discussions are guided by the broader, underlying question of how long it is sensible to maintain a piece of software.