HOLE: Homological Observation of Latent Embeddings for Neural Network Interpretability
This work addresses the interpretability challenge in deep learning and is aimed at both researchers and practitioners. Its contribution is incremental, since it applies existing topological tools to neural networks.
The paper addresses the interpretation of deep neural networks by introducing HOLE, a method that applies persistent homology to neural activations. The results indicate that the extracted topological features reveal patterns related to class separation and model robustness.
Deep learning models have achieved remarkable success across various domains, yet their learned representations and decision-making processes remain largely opaque and hard to interpret. This work introduces HOLE (Homological Observation of Latent Embeddings), a method for analyzing and interpreting deep neural networks through persistent homology. HOLE extracts topological features from neural activations and presents them using a suite of visualization techniques, including Sankey diagrams, heatmaps, dendrograms, and blob graphs. These tools facilitate the examination of representation structure and quality across layers. We evaluate HOLE on standard datasets using a range of discriminative models, focusing on representation quality, interpretability across layers, and robustness to input perturbations and model compression. The results indicate that topological analysis reveals patterns associated with class separation, feature disentanglement, and model robustness, providing a complementary perspective for understanding and improving deep learning systems.
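To make the idea of a topological feature concrete, the sketch below computes 0-dimensional persistent homology (connected components) of a small point cloud standing in for layer activations. It is an illustrative example only, not the HOLE implementation: the function name `h0_death_times` and the toy points are assumptions introduced here. In a Vietoris-Rips filtration every point starts as its own component at scale 0, and a component dies at the length of the edge that merges it into another, so the death times are the edge lengths selected by a Kruskal-style union-find pass.

```python
import math
from itertools import combinations


def h0_death_times(points):
    """Death times of connected components (H0) in a Vietoris-Rips filtration.

    `points` is a sequence of equal-length coordinate sequences, for example
    activation vectors from one layer. Returns n - 1 finite death times in
    ascending order; the one remaining component never dies.
    """
    n = len(points)

    def euclidean(p, q):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

    # All pairwise edges, sorted by length (the filtration order).
    edges = sorted(
        (euclidean(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )

    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    deaths = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            # Two components merge at this scale, so one of them dies here.
            parent[root_i] = root_j
            deaths.append(length)
    return deaths


# Hypothetical toy "activations": two tight, well-separated clusters.
toy_activations = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
print(h0_death_times(toy_activations))  # [1.0, 1.0, 10.0]
```

In this toy case the two short death times reflect merges within each cluster, while the single long one reflects the gap between clusters. A large gap between the small and large death times is the kind of pattern one would associate with well-separated classes; the same merge order also underlies a single-linkage dendrogram.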