LGJul 26, 2024Code
Advancing Neural Network Performance through Emergence-Promoting Initialization SchemeJohnny Jingze Li, Vivek Kurien George, Gabriel A. Silva
Emergence in machine learning refers to the spontaneous appearance of complex behaviors or capabilities that arise from the scale and structure of training data and model architectures, despite not being explicitly programmed. We introduce a novel yet straightforward neural network initialization scheme that aims at achieving greater potential for emergence. Measuring emergence as a kind of structural nonlinearity, our method adjusts the layer-wise weight scaling factors to achieve higher emergence values. This enhancement is easy to implement, requiring no additional optimization steps for initialization compared to GradInit. We evaluate our approach across various architectures, including MLP and convolutional architectures for image recognition and transformers for machine translation. We demonstrate substantial improvements in both model accuracy and training speed, with and without batch normalization. The simplicity, theoretical innovation, and demonstrable empirical advantages of our method make it a potent enhancement to neural network initialization practices. These results suggest a promising direction for leveraging emergence to improve neural network training methodologies. Code is available at: https://github.com/johnnyjingzeli/EmergenceInit.
MATH-PHApr 28, 2010
Computational efficiency of fractional diffusion using adaptive time step memoryBrian P. Sprouse, Christopher L. MacDonald, Gabriel A. Silva
Numerical solutions to fractional differential equations can be extremely computationally intensive due to the effect of non-local derivatives in which all previous time points contribute to the current iteration. In finite difference methods this has been approximated using the 'short memory effect' where it is assumed that previous events prior to some certain time point are insignificant and thus not calculated. Here we present an adaptive time method for smooth functions that is computationally efficient and results in smaller errors during numerical simulations. Sampled points along the system's history at progressively longer intervals are assumed to reflect the values of neighboring time points. By including progressively fewer points as a function of time, a temporally 'weighted' history is computed that includes contributions from the entire past of the system, resulting in increased accuracy, but with fewer points actually calculated, which ensures computational efficiency.
LGSep 3, 2024
Quantifying Emergence in Neural Networks: Insights from Pruning and Training DynamicsFaisal AlShinaifi, Zeyad Almoaigel, Johnny Jingze Li et al.
Emergence, where complex behaviors develop from the interactions of simpler components within a network, plays a crucial role in enhancing neural network capabilities. We introduce a quantitative framework to measure emergence during the training process and examine its impact on network performance, particularly in relation to pruning and training dynamics. Our hypothesis posits that the degree of emergence, defined by the connectivity between active and inactive nodes, can predict the development of emergent behaviors in the network. Through experiments with feedforward and convolutional architectures on benchmark datasets, we demonstrate that higher emergence correlates with improved trainability and performance. We further explore the relationship between network complexity and the loss landscape, suggesting that higher emergence indicates a greater concentration of local minima and a more rugged loss landscape. Pruning, which reduces network complexity by removing redundant nodes and connections, is shown to enhance training efficiency and convergence speed, though it may lead to a reduction in final accuracy. These findings provide new insights into the interplay between emergence, complexity, and performance in neural networks, offering valuable implications for the design and optimization of more efficient architectures.
QUANT-PHMar 27, 2024
Leveraging Quantum Superposition to Infer the Dynamic Behavior of a Spatial-Temporal Neural Network Signaling ModelGabriel A. Silva
The exploration of new problem classes for quantum computation is an active area of research. In this paper, we introduce and solve a novel problem class related to dynamics on large-scale networks relevant to neurobiology and machine learning. Specifically, we ask if a network can sustain inherent dynamic activity beyond some arbitrary observation time or if the activity ceases through quiescence or saturation via an epileptic-like state. We show that this class of problems can be formulated and structured to take advantage of quantum superposition and solved efficiently using a coupled workflow between the Grover and Deutsch-Jozsa quantum algorithms. To do so, we extend their functionality to address the unique requirements of how input (sub)sets into the algorithms must be mathematically structured while simultaneously constructing the inputs so that measurement outputs can be interpreted as meaningful properties of the network dynamics. This, in turn, allows us to answer the question we pose.
LGMar 16, 2021
Learning without gradient descent encoded by the dynamics of a neurobiological modelVivek Kurien George, Vikash Morar, Weiwei Yang et al.
The success of state-of-the-art machine learning is essentially all based on different variations of gradient descent algorithms that minimize some version of a cost or loss function. A fundamental limitation, however, is the need to train these systems in either supervised or unsupervised ways by exposing them to typically large numbers of training examples. Here, we introduce a fundamentally novel conceptual approach to machine learning that takes advantage of a neurobiologically derived model of dynamic signaling, constrained by the geometric structure of a network. We show that MNIST images can be uniquely encoded and classified by the dynamics of geometric networks with nearly state-of-the-art accuracy in an unsupervised way, and without the need for any training.
LGOct 16, 2020
Generalizable Machine Learning in Neuroscience using Graph Neural NetworksPaul Y. Wang, Sandalika Sapra, Vivek Kurien George et al.
Although a number of studies have explored deep learning in neuroscience, the application of these algorithms to neural systems on a microscopic scale, i.e. parameters relevant to lower scales of organization, remains relatively novel. Motivated by advances in whole-brain imaging, we examined the performance of deep learning models on microscopic neural dynamics and resulting emergent behaviors using calcium imaging data from the nematode C. elegans. We show that neural networks perform remarkably well on both neuron-level dynamics prediction, and behavioral state classification. In addition, we compared the performance of structure agnostic neural networks and graph neural networks to investigate if graph structure can be exploited as a favorable inductive bias. To perform this experiment, we designed a graph neural network which explicitly infers relations between neurons from neural activity and leverages the inferred graph structure during computations. In our experiments, we found that graph neural networks generally outperformed structure agnostic models and excel in generalization on unseen organisms, implying a potential path to generalizable machine learning in neuroscience.