Estimate of the Neural Network Dimension using Algebraic Topology and Lie Theory
This work addresses the challenge of choosing a neural network architecture, a problem relevant to researchers and practitioners, but it is incremental: it builds on existing topological methods and relies on specific assumptions.
The paper tackles the problem of determining the minimal number of neurons needed in a neural network layer to learn the topology of input data, using algebraic topology and Lie theory to derive precise dimension estimates and validating them on toy datasets.
In this paper we present an approach to determine the smallest number of neurons in a layer of a neural network such that the topology of the input space can be learned sufficiently well. We introduce a general procedure based on persistent homology to investigate topological invariants of the manifold on which we suspect the data set lies. We specify the required dimensions precisely, assuming that the data are located on or near a smooth manifold. Furthermore, we require that this space is connected and has a commutative group structure in the mathematical sense. These assumptions allow us to derive a decomposition of the underlying space whose topology is well known. We use the representatives of the $k$-dimensional homology groups from the persistence landscape to determine an integer dimension for this decomposition. This number is the dimension of the embedding that is capable of capturing the topology of the data manifold. We derive the theory and validate it experimentally on toy data sets.
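The role of the connectedness and commutativity assumptions can be illustrated with a small sketch. A standard fact from Lie theory is that a connected commutative Lie group is isomorphic to a product $\mathbb{T}^r \times \mathbb{R}^m$ of a torus and a Euclidean space. Since $\mathbb{R}^m$ is contractible, the Betti numbers of the product are those of the torus, $\beta_i = \binom{r}{i}$. The Python function below checks whether a list of Betti numbers is consistent with this form and, if so, returns the torus rank $r$. The function name `torus_rank_from_betti`, the symbols $r$ and $m$, and the input format are assumptions made for illustration and are not taken from the paper. The sketch is not the authors' procedure and does not compute the embedding dimension itself.

```python
from math import comb


def torus_rank_from_betti(betti):
    """Return r if the Betti numbers match those of T^r x R^m, else None.

    betti[i] is the i-th Betti number, for example read off from the
    long-lived features of a persistence diagram (hypothetical input
    format, not taken from the paper). Only the degrees present in the
    list are checked.
    """
    if not betti or betti[0] != 1:
        # A connected space has exactly one connected component.
        return None
    # For a torus T^r the first Betti number equals the rank r.
    r = betti[1] if len(betti) > 1 else 0
    # T^r has Betti numbers C(r, i); the contractible factor R^m
    # contributes nothing to homology.
    for i, b in enumerate(betti):
        if b != comb(r, i):
            return None
    return r


print(torus_rank_from_betti([1, 2, 1]))  # Betti numbers of the 2-torus
print(torus_rank_from_betti([1, 1]))     # Betti numbers of the circle
print(torus_rank_from_betti([1, 0, 1]))  # Betti numbers of the 2-sphere
```

In practice the Betti numbers would have to be estimated from persistent homology of the sampled data, so a consistency check of this kind is only as reliable as that estimate. A result of `None`, as for the 2-sphere, indicates that the assumed group structure does not fit the observed topology.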