ML LG Jun 4

Symmetric Divergence and Normalized Similarity: A Unified Topological Framework for Representation Analysis

arXiv:2606.06342 70.8

AI Analysis

For researchers analyzing neural representations, this work provides a principled topological framework that addresses two limitations of existing methods, asymmetry and sample-size dependence, enabling more reliable cross-scenario benchmarking.

The paper introduces a unified topological toolkit for representation analysis, including Symmetric Representation Topology Divergence (SRTD) for fine-grained structural diagnosis and Normalized Topological Similarity (NTS) for robust, scale-invariant benchmarking. Experiments show the toolkit captures functional shifts in CNNs missed by geometric measures and robustly maps LLM genealogy even under distance saturation.

Topological Data Analysis (TDA) offers a principled, intrinsic lens for comparing neural representations. However, existing paired topological divergences (e.g., RTD) are limited by heuristic asymmetry and, more critically, unbounded scores that depend on sample size, hindering reliable cross-scenario benchmarking. To address these challenges, we develop a unified topological toolkit serving two complementary needs: fine-grained structural diagnosis and robust, standardized evaluation. First, we complete the RTD framework by introducing Symmetric Representation Topology Divergence (SRTD) and its efficient variant SRTD-lite. Beyond resolving the theoretical asymmetry of prior variants, SRTD consolidates diagnostic information into a single, comprehensive cross-barcode signature. This allows for precise localization of structural discrepancies and serves as an effective optimization objective without the overhead of dual directional computations. Second, to enable reliable benchmarking across heterogeneous settings, we propose Normalized Topological Similarity (NTS). By measuring the rank correlation of hierarchical merge orders, NTS yields a scale-invariant metric bounded between -1 and 1, effectively overcoming the scale and sample-dependence of unnormalized divergences. Experiments across synthetic and real-world deep learning settings demonstrate that our toolkit captures functional shifts in CNNs missed by geometric measures and robustly maps LLM genealogy even under distance saturation, offering a rigorous, topology-aware perspective that complements measures like CKA.
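The abstract describes NTS only at a high level, as a rank correlation of hierarchical merge orders. The sketch below is one possible reading of that idea, not the authors' definition: it assumes single-linkage clustering as the source of merge heights and Spearman correlation as the rank statistic. The function name `merge_order_similarity` is hypothetical and introduced here for illustration only.

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, linkage
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr


def merge_order_similarity(X, Y):
    """Rank-correlate the single-linkage merge heights of two representations.

    X and Y hold representations of the same n samples, one row per sample.
    Illustrative assumption only; the paper's NTS may be defined differently.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if X.shape[0] != Y.shape[0]:
        raise ValueError("X and Y must describe the same number of samples")

    # Cophenetic distance of a pair = height at which the two samples
    # first merge in the single-linkage dendrogram (condensed form).
    merge_x = cophenet(linkage(pdist(X), method="single"))
    merge_y = cophenet(linkage(pdist(Y), method="single"))

    # Spearman correlation compares only the ordering of merge heights,
    # so the result lies in [-1, 1] and ignores the distance scale.
    rho, _ = spearmanr(merge_x, merge_y)
    return float(rho)


rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))
same_structure = merge_order_similarity(X, 2.0 * X)             # rescaled copy
unrelated = merge_order_similarity(X, rng.normal(size=(40, 3)))  # independent data
print(same_structure, unrelated)
```

Because only ranks enter the correlation, uniformly rescaling one representation leaves the score unchanged, which is the kind of scale invariance the abstract attributes to NTS. An unnormalized sum of barcode lengths, by contrast, would change with both the scale and the number of samples.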
