Model-centric Data Manifold: the Data Through the Eyes of the Model
The work offers a geometric interpretation of how a trained classifier structures its input data, which could aid in understanding model behavior and data representation, though its scope appears incremental.
The paper finds that deep ReLU neural network classifiers perceive data as a low-dimensional Riemannian manifold: the data domain is foliated, and the training dataset lies on a single leaf whose dimension is bounded by the number of classification labels. MNIST experiments support this: paths on that leaf connect valid images, while other leaves cover noisy ones.
We discover that deep ReLU neural network classifiers can see a low-dimensional Riemannian manifold structure on data. This structure arises from the local data matrix, a variation of the Fisher information matrix in which the role of the model parameters is taken by the data variables. We obtain a foliation of the data domain and show that the dataset on which the model is trained lies on a leaf, the data leaf, whose dimension is bounded by the number of classification labels. We validate our results with experiments on the MNIST dataset: paths on the data leaf connect valid images, while other leaves cover noisy images.
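To make the local data matrix concrete, here is a minimal NumPy sketch. It assumes the matrix takes the usual Fisher form with gradients taken with respect to the input x instead of the weights, that is, G(x) = sum over y of p(y|x) * grad_x log p(y|x) * grad_x log p(y|x)^T. The abstract does not spell out this formula, so treat it as one plausible reading. The function name `local_data_matrix`, the one-hidden-layer architecture, and the random weights are all hypothetical and stand in for a trained model.

```python
import numpy as np

def local_data_matrix(x, W1, b1, W2, b2):
    """Assumed form of the local data matrix for a one-hidden-layer
    ReLU + softmax classifier (illustrative, not the paper's code)."""
    h_pre = W1 @ x + b1
    h = np.maximum(h_pre, 0.0)                 # ReLU activations
    logits = W2 @ h + b2
    logits = logits - logits.max()             # numerically stable softmax
    p = np.exp(logits) / np.exp(logits).sum()  # p(y | x)

    # Jacobian of the logits w.r.t. the input: W2 @ diag(relu'(h_pre)) @ W1
    mask = (h_pre > 0).astype(x.dtype)
    J = W2 @ (mask[:, None] * W1)              # shape (C, d)

    # Row y of S is grad_x log p(y | x) = J^T (e_y - p)
    S = (np.eye(len(p)) - p) @ J               # shape (C, d)

    # G = sum_y p_y * S[y]^T S[y]
    G = S.T @ (p[:, None] * S)                 # shape (d, d)
    return G, p

# Hypothetical random network: d inputs, C labels.
rng = np.random.default_rng(0)
d, hidden, C = 20, 32, 4
W1 = rng.normal(size=(hidden, d))
b1 = rng.normal(size=hidden)
W2 = rng.normal(size=(C, hidden))
b2 = rng.normal(size=C)
x = rng.normal(size=d)

G, p = local_data_matrix(x, W1, b1, W2, b2)
eig = np.linalg.eigvalsh(G)
rank = int((eig > 1e-9 * eig.max()).sum())     # numerical rank of G
print("numerical rank:", rank, "| number of labels:", C)
```

In this sketch G equals J^T (diag(p) - p p^T) J, so its rank cannot exceed C - 1 regardless of the input dimension d. That is consistent with the abstract's statement that the leaf dimension is bounded by the number of classification labels.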