Fisher information flow in artificial neural networks
This work aims to improve ANN-based parameter estimation in physics and other fields by offering a way to decide when to stop training and thereby avoid overfitting.
The authors study how Fisher information flows through artificial neural networks (ANNs) during parameter estimation tasks. They show that optimal performance corresponds to maximal Fisher information transmission and that overfitting leads to information loss. This yields a model-free stopping criterion for training that requires no validation dataset.
The estimation of continuous parameters from measured data plays a central role in many fields of physics. A key tool in understanding and improving such estimation processes is the concept of Fisher information, which quantifies how information about unknown parameters propagates through a physical system and determines the ultimate limits of precision. With artificial neural networks (ANNs) gradually becoming an integral part of many measurement systems, it is essential to understand how they process and transmit parameter-relevant information internally. Here, we present a method to monitor the flow of Fisher information through an ANN performing a parameter estimation task, tracking it from the input to the output layer. We show that optimal estimation performance corresponds to the maximal transmission of Fisher information, and that training beyond this point results in information loss due to overfitting. This provides a model-free stopping criterion for network training, eliminating the need for a separate validation dataset. To demonstrate the practical relevance of our approach, we apply it to a network trained on data from an imaging experiment, highlighting its effectiveness in a realistic physical setting.
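To make the idea of Fisher information "flowing" through a layer concrete, the following is a minimal sketch and not the method of the paper. It assumes a toy setting that the abstract does not specify: Gaussian input data whose mean depends linearly on a scalar parameter and whose covariance is fixed, passed through purely linear layers. In that setting the Fisher information has the closed form F = g^T Σ^{-1} g, where g is the derivative of the mean with respect to the parameter. All names (`gaussian_fisher_information`, `W_full`, `W_narrow`, `w_matched`) are hypothetical and chosen for illustration.

```python
import numpy as np


def gaussian_fisher_information(mean_grad, cov):
    """Fisher information about a scalar parameter for Gaussian data.

    Assumes the mean depends on the parameter (with derivative `mean_grad`)
    and the covariance `cov` does not. Then F = g^T cov^{-1} g.
    """
    mean_grad = np.asarray(mean_grad, dtype=float)
    return float(mean_grad @ np.linalg.solve(cov, mean_grad))


rng = np.random.default_rng(0)
n = 6

# Toy input model (an assumption): x = theta * s + noise, noise ~ N(0, sigma^2 I).
s = rng.normal(size=n)            # derivative of the input mean w.r.t. theta
sigma = 0.5
cov_in = sigma**2 * np.eye(n)
fisher_in = gaussian_fisher_information(s, cov_in)

# An invertible linear layer y = W x transmits all of the information.
W_full = np.eye(n) + 0.1 * rng.normal(size=(n, n))
fisher_full = gaussian_fisher_information(W_full @ s, W_full @ cov_in @ W_full.T)

# A generic compressive layer (6 -> 2) can only lose information.
W_narrow = rng.normal(size=(2, n))
fisher_narrow = gaussian_fisher_information(
    W_narrow @ s, W_narrow @ cov_in @ W_narrow.T
)

# A single well-chosen output (a matched filter) keeps all of it.
w_matched = np.linalg.solve(cov_in, s)[None, :]   # shape (1, n)
fisher_matched = gaussian_fisher_information(
    w_matched @ s, w_matched @ cov_in @ w_matched.T
)

print(f"input layer:          {fisher_in:.4f}")
print(f"invertible layer:     {fisher_full:.4f}")
print(f"random 6->2 layer:    {fisher_narrow:.4f}")
print(f"matched-filter layer: {fisher_matched:.4f}")
```

In this toy case the output Fisher information never exceeds the input value, and a single output neuron can reach that bound only if its weights are well matched to the signal. For a real nonlinear network the mean derivative and covariance at each layer are not available in closed form and would have to be estimated from data, which is beyond this sketch.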