The Rashomon Effect for Visualizing High-Dimensional Data
This work addresses ambiguous and untrustworthy visualizations in high-dimensional data analysis, offering researchers and practitioners a flexible, incremental framework.
The paper tackles the non-uniqueness of dimension reduction embeddings by defining the Rashomon set of 'good' embeddings and introduces methods to leverage this multiplicity for more interpretable and robust visualizations, resulting in improved local structure and alignment with external concepts.
Dimension reduction (DR) is inherently non-unique: multiple embeddings can preserve the structure of high-dimensional data equally well while differing in layout or geometry. In this paper, we formally define the Rashomon set for DR -- the collection of 'good' embeddings -- and show how embracing this multiplicity leads to more powerful and trustworthy representations. Specifically, we pursue three goals. First, we introduce PCA-informed alignment to steer embeddings toward principal components, making axes interpretable without distorting local neighborhoods. Second, we design concept-alignment regularization that aligns an embedding dimension with external knowledge, such as class labels or user-defined concepts. Third, we propose a method to extract common knowledge across the Rashomon set by identifying trustworthy and persistent nearest-neighbor relationships, which we use to construct refined embeddings with improved local structure while preserving global relationships. By moving beyond a single embedding and leveraging the Rashomon set, we provide a flexible framework for building interpretable, robust, and goal-aligned visualizations.
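To make the third goal concrete, the following is a minimal sketch of one way "persistent" nearest-neighbor relationships could be read off a collection of embeddings. It is an illustration under stated assumptions, not the procedure used in the paper: the function names `knn_sets` and `persistent_neighbors`, the parameter `min_count`, and the rule "a neighbor pair is persistent if it appears in the k-nearest-neighbor list of at least `min_count` embeddings" are all hypothetical, and the toy coordinates are made up for the example.

```python
from collections import Counter
from math import dist


def knn_sets(points, k):
    """For each point, return the set of indices of its k nearest neighbors."""
    neighbors = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        # Sort the other points by Euclidean distance to point i.
        others.sort(key=lambda j: dist(p, points[j]))
        neighbors.append(set(others[:k]))
    return neighbors


def persistent_neighbors(embeddings, k, min_count):
    """Keep neighbor j of point i only if j is among i's k nearest
    neighbors in at least `min_count` of the given embeddings.

    Hypothetical persistence rule, chosen for illustration only.
    """
    n = len(embeddings[0])
    counts = [Counter() for _ in range(n)]
    for emb in embeddings:
        for i, nbrs in enumerate(knn_sets(emb, k)):
            counts[i].update(nbrs)
    return [{j for j, c in counts[i].items() if c >= min_count}
            for i in range(n)]


# Three toy 2-D embeddings of the same four points (assumed data).
emb_a = [(0, 0), (1, 0), (5, 0), (6, 0)]
emb_b = [(0, 0), (0, 1), (0, 5), (0, 6)]    # rotated layout, same neighbors
emb_c = [(0, 0), (1, 0), (1.2, 0), (6, 0)]  # point 2 drifts toward point 1

strict = persistent_neighbors([emb_a, emb_b, emb_c], k=1, min_count=3)
majority = persistent_neighbors([emb_a, emb_b, emb_c], k=1, min_count=2)
print(strict)
print(majority)
```

With `min_count=3` only relationships that hold in every embedding survive, while `min_count=2` keeps those supported by a majority. In this sketch the threshold plays the role of a trust level; how the paper scores trustworthiness and builds the refined embedding from the retained pairs is not specified in the abstract.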