LG · NE · RT · Sep 20, 2024

Hidden Activations Are Not Enough: A General Approach to Neural Network Predictions

arXiv:2409.13163v1 · h-index: 1 · Has Code
Originality: Highly original
AI Analysis

The paper provides a general, architecture-agnostic approach to understanding neural network behavior, which could help machine learning practitioners with robustness and interpretability.

The paper tackles the problem of analyzing neural network predictions. It introduces a mathematical framework based on quiver representation theory that quantifies the similarity between new data and the training data. Proof-of-concept experiments on the MNIST and FashionMNIST datasets apply the framework to detecting adversarial examples on MLP architectures.

We introduce a novel mathematical framework for analyzing neural networks using tools from quiver representation theory. This framework enables us to quantify the similarity between a new data sample and the training data, as perceived by the neural network. By leveraging the induced quiver representation of a data sample, we capture more information than traditional hidden layer outputs. This quiver representation abstracts away the complexity of the computations of the forward pass into a single matrix, allowing us to employ simple geometric and statistical arguments in a matrix space to study neural network predictions. Our mathematical results are architecture-agnostic and task-agnostic, making them broadly applicable. As proof-of-concept experiments, we apply our results to the MNIST and FashionMNIST datasets on the problem of detecting adversarial examples, across different MLP architectures and several adversarial attack methods. Our experiments can be reproduced with our publicly available repository: https://github.com/MarcoArmenta/Hidden-Activations-are-not-Enough
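One way to picture how a forward pass can collapse into a single matrix is the simplified sketch below. It is an illustration under stated assumptions, not the authors' implementation: it assumes a bias-free ReLU MLP, where the output on input x equals W_L D_{L-1} ... D_1 W_1 x with D_k the 0/1 activation masks, so the matrix M(x) = W_L D_{L-1} ... D_1 W_1 diag(x) satisfies M(x) · 1 = f(x). The function names (`relu_mlp_forward`, `sample_matrix`, `frobenius_distance`) are hypothetical, and the Frobenius distance is only one possible geometric comparison in matrix space; the abstract does not specify which measure is used.

```python
import random


def matvec(a, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(x * y for x, y in zip(row, v)) for row in a]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def relu_mlp_forward(weights, x):
    """Forward pass of a bias-free ReLU MLP (assumption); also returns 0/1 masks."""
    masks = []
    h = x
    for w in weights[:-1]:
        pre = matvec(w, h)
        mask = [1.0 if p > 0 else 0.0 for p in pre]
        h = [p * m for p, m in zip(pre, mask)]
        masks.append(mask)
    return matvec(weights[-1], h), masks


def sample_matrix(weights, x):
    """Collapse the forward pass on x into one matrix M with M @ ones == f(x)."""
    _, masks = relu_mlp_forward(weights, x)
    # Start with W_1 diag(x): scale column j of the first layer by x[j].
    m = [[w_ij * x_j for w_ij, x_j in zip(row, x)] for row in weights[0]]
    for w, mask in zip(weights[1:], masks):
        # Zero the rows of inactive hidden units, then apply the next layer.
        m = [[v * d for v in row] for row, d in zip(m, mask)]
        m = matmul(w, m)
    return m


def frobenius_distance(a, b):
    """Frobenius distance between two equally shaped matrices (illustrative choice)."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)) ** 0.5


# Toy network with layer sizes 4 -> 5 -> 3 and random weights (hypothetical values).
rng = random.Random(0)
dims = [4, 5, 3]
weights = [
    [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    for n_in, n_out in zip(dims, dims[1:])
]
x = [rng.uniform(0, 1) for _ in range(dims[0])]

output, _ = relu_mlp_forward(weights, x)
m = sample_matrix(weights, x)
row_sums = [sum(row) for row in m]  # M(x) applied to the all-ones vector
```

In this toy setting, `row_sums` matches `output` up to floating-point error, while `m` itself keeps a per-input-feature breakdown that the hidden activations alone do not expose. Comparing `m` for a new sample against a reference matrix built from training samples, for example with `frobenius_distance`, is one hedged reading of the "geometric arguments in a matrix space" mentioned in the abstract.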

Code Implementations: 1 repo
