Interpreting Neural Networks through Mahalanobis Distance
This work addresses the need for transparency in neural networks, particularly in applications that require interpretability. Its contribution is incremental, however, because it builds on existing statistical concepts.
The paper tackles the problem of neural network interpretability by introducing a theoretical framework that connects linear layers with the Mahalanobis distance. It offers a new perspective but provides no empirical validation.
This paper introduces a theoretical framework that connects the linear layers of neural networks with the Mahalanobis distance, offering a new perspective on neural network interpretability. While previous studies have explored activation functions primarily for performance optimization, our work interprets these functions through statistical distance measures, a less-explored area in neural network research. By establishing this connection, we provide a foundation for developing more interpretable neural network models, which is crucial for applications requiring transparency. Although this work is theoretical and does not include empirical data, the proposed distance-based interpretation has the potential to enhance model robustness, improve generalization, and provide more intuitive explanations of neural network decisions.
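To make the stated connection concrete, the following is a minimal sketch of one standard way a linear layer can reproduce the Mahalanobis distance. It assumes a Gaussian-style description of the data by a mean `mu` and a symmetric positive-definite covariance `cov`, and it uses the eigendecomposition of `cov` to build the layer. The function names, the synthetic data, and the choice of NumPy are illustrative assumptions, not details taken from the text above.

```python
import numpy as np


def mahalanobis_direct(x, mu, cov):
    """Textbook definition: sqrt((x - mu)^T cov^{-1} (x - mu))."""
    diff = x - mu
    return float(np.sqrt(diff @ np.linalg.solve(cov, diff)))


def whitening_layer(mu, cov):
    """Build weights W and bias b so that ||W x + b||_2 is the Mahalanobis distance.

    Assumes cov is symmetric positive definite, cov = V diag(lam) V^T.
    Then W = diag(lam)^{-1/2} V^T and b = -W mu.
    """
    lam, V = np.linalg.eigh(cov)   # columns of V are the eigenvectors
    W = (V / np.sqrt(lam)).T       # row i is v_i^T / sqrt(lam_i)
    b = -W @ mu
    return W, b


def mahalanobis_via_layer(x, W, b):
    """Apply the linear layer, then take the Euclidean norm of its output."""
    return float(np.linalg.norm(W @ x + b))


# Synthetic, correlated 3-D data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
data = rng.normal(size=(500, 3)) @ rng.normal(size=(3, 3))
mu = data.mean(axis=0)
cov = np.cov(data, rowvar=False)

x = data[0]
W, b = whitening_layer(mu, cov)
d_direct = mahalanobis_direct(x, mu, cov)
d_layer = mahalanobis_via_layer(x, W, b)
print(d_direct, d_layer)  # the two values should agree up to rounding
```

In this sketch each output unit of the layer is a signed, variance-normalized offset of `x` from `mu` along one principal axis, so taking its absolute value gives a per-axis distance. Whether this matches the exact construction the authors have in mind is an assumption.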