NE · Oct 12, 2023
How connectivity structure shapes rich and lazy learning in neural circuits
Yuhan Helena Liu, Aristide Baratin, Jonathan Cornford et al. · uw
In theoretical neuroscience, recent work leverages deep learning tools to explore how certain attributes of a network critically influence its learning dynamics. Notably, initial weight distributions with small (resp. large) variance may yield a rich (resp. lazy) regime, where significant (resp. minor) changes to network states and representation are observed over the course of learning. However, in biology, neural circuit connectivity could exhibit a low-rank structure and therefore differ markedly from the random initializations generally used for these studies. As such, here we investigate how the structure of the initial weights, in particular their effective rank, influences the network learning regime. Through both empirical and theoretical analyses, we discover that high-rank initializations typically yield smaller network changes indicative of lazier learning, a finding we also confirm with experimentally driven initial connectivity in recurrent neural networks. Conversely, low-rank initializations bias the network towards richer learning. Importantly, however, as an exception to this rule, we find lazier learning can still occur with a low-rank initialization that aligns with task and data statistics. Our research highlights the pivotal role of initial weight structures in shaping learning regimes, with implications for metabolic costs of plasticity and risks of catastrophic forgetting.
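To make the notion of effective rank concrete, here is a minimal sketch that compares a full-rank Gaussian initialization with a rank-2 one. It assumes one common definition of effective rank (the exponential of the Shannon entropy of the normalized singular values), which may differ from the measure used in the paper. The helper names `effective_rank` and `low_rank_init`, the network size, and the scaling are illustrative choices, not taken from the abstract.

```python
import numpy as np

def effective_rank(W):
    """Entropy-based effective rank (an assumed definition, not necessarily the paper's):
    exp of the Shannon entropy of the singular values normalized to sum to 1."""
    s = np.linalg.svd(W, compute_uv=False)
    p = s / s.sum()
    p = p[p > 0]  # drop exact zeros so log is defined
    return float(np.exp(-(p * np.log(p)).sum()))

def low_rank_init(n, rank, rng):
    """Hypothetical low-rank initialization: product of two Gaussian factors."""
    U = rng.standard_normal((n, rank))
    V = rng.standard_normal((n, rank))
    return U @ V.T / np.sqrt(n * rank)

rng = np.random.default_rng(0)
n = 100
W_full = rng.standard_normal((n, n)) / np.sqrt(n)  # standard random initialization
W_low = low_rank_init(n, 2, rng)                   # rank-2 initialization

er_full = effective_rank(W_full)
er_low = effective_rank(W_low)
print(f"effective rank, full-rank init: {er_full:.1f}")
print(f"effective rank, rank-2 init:    {er_low:.2f}")
```

Under this definition the rank-2 matrix has an effective rank of at most 2, while the dense Gaussian matrix has an effective rank that is a large fraction of `n`.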
CD · Feb 24, 2025
A Fokker-Planck-Based Loss Function that Bridges Dynamics with Density Estimation
Zhixin Lu, Łukasz Kuśmierz, Stefan Mihalas
We have derived a novel loss function from the Fokker-Planck equation that links dynamical system models with their probability density functions, demonstrating its utility in model identification and density estimation. In the first application, we show that this loss function can enable the extraction of dynamical parameters from non-temporal datasets, including timestamp-free measurements from steady non-equilibrium systems such as noisy Lorenz systems and gene regulatory networks. In the second application, when coupled with a density estimator, this loss facilitates density estimation when the dynamic equations are known. For density estimation, we propose a density estimator that integrates a Gaussian Mixture Model with a normalizing flow model. It simultaneously estimates normalized density, energy, and score functions from both empirical data and dynamics. It is compatible with a variety of data-based training methodologies, including maximum likelihood and score matching. It features a latent space akin to a modern Hopfield network, where the inherent Hopfield energy effectively assigns low densities to sparsely populated data regions, addressing common challenges in neural density estimators. Additionally, this Hopfield-like energy enables direct and rapid data manipulation through the Concave-Convex Procedure (CCCP) rule, facilitating tasks such as denoising and clustering. Our work demonstrates a principled framework for leveraging the complex interdependencies between dynamics and density estimation, as illustrated through synthetic examples that clarify the underlying theoretical intuitions.
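The abstract does not state the loss itself. As an illustration of how a Fokker-Planck equation can tie a drift model to a density, the following sketch assumes an Itô process with parametric drift $f_\theta$ and constant isotropic diffusion $D$ (both assumptions, as is the symbol $r_\theta$), and writes the stationarity residual in terms of the score $s = \nabla \log p$. It is one plausible construction, not necessarily the authors' loss.

```latex
% Assumed dynamics: dx = f_\theta(x)\,dt + \sqrt{2D}\,dW_t
% Stationary Fokker-Planck equation for the density p(x):
\begin{align}
0 &= -\nabla \cdot \bigl( f_\theta(x)\, p(x) \bigr) + D\, \nabla^2 p(x) \\
% Divide by p(x) > 0 and substitute the score s(x) = \nabla \log p(x):
r_\theta(x) &= -\nabla \cdot f_\theta(x) - f_\theta(x) \cdot s(x)
              + D \bigl( \nabla \cdot s(x) + \lVert s(x) \rVert^2 \bigr) \\
% A residual loss that needs only samples from p, with no timestamps:
\mathcal{L}(\theta) &= \mathbb{E}_{x \sim p} \bigl[ r_\theta(x)^2 \bigr]
\end{align}
```

Read this way, the residual can be minimized over $\theta$ when the score is estimated from data (model identification), or over the parameters of a density model when $f_\theta$ is known (density estimation), which matches the two applications described above.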
NC · Oct 27, 2018
Convolutional neural networks with extra-classical receptive fields
Brian Hu, Stefan Mihalas
Convolutional neural networks (CNNs) have had great success in many real-world applications and have also been used to model visual processing in the brain. However, these networks are quite brittle: small changes in the input image can dramatically change a network's output prediction. In contrast to what is known from biology, these networks largely rely on feedforward connections, ignoring the influence of recurrent connections. They also focus on supervised rather than unsupervised learning. To address these issues, we combine traditional supervised learning via backpropagation with a specialized unsupervised learning rule to learn lateral connections between neurons within a convolutional neural network. These connections have been shown to optimally integrate information from the surround, generating extra-classical receptive fields for the neurons in our newly proposed model (CNNEx). Models with optimal lateral connections are more robust to noise and achieve better performance on noisy versions of the MNIST and CIFAR-10 datasets. Resistance to noise can be further improved by combining our model with additional regularization techniques such as dropout and weight decay. Although the image statistics of MNIST and CIFAR-10 differ greatly, the same unsupervised learning rule generalized to both datasets. Our results demonstrate the potential usefulness of combining supervised and unsupervised learning techniques and suggest that the integration of lateral connections into convolutional neural networks is an important area of future research.