LG Jun 12, 2023
On the Dynamics of Learning Time-Aware Behavior with Recurrent Neural Networks
Peter DelMastro, Rushiv Arora, Edward Rietman et al.
Recurrent Neural Networks (RNNs) have shown great success in modeling time-dependent patterns, but there is limited research on their learned representations of latent temporal features and the emergence of these representations during training. To address this gap, we use timed automata (TA) to introduce a family of supervised learning tasks modeling behavior dependent on hidden temporal variables whose complexity is directly controllable. Building upon past studies from the perspective of dynamical systems, we train RNNs to emulate temporal flipflops, a new collection of TA that emphasizes the need for time-awareness over long-term memory. We find that these RNNs learn in phases: they quickly perfect any time-independent behavior, but they initially struggle to discover the hidden time-dependent features. In the case of periodic "time-of-day" aware automata, we show that the RNNs learn to switch between periodic orbits that encode time modulo the period of the transition rules. We subsequently apply fixed point stability analysis to monitor changes in the RNN dynamics during training, and we observe that the learning phases are separated by a bifurcation from which the periodic behavior emerges. In this way, we demonstrate how dynamical systems theory can provide insights into not only the learned representations of these models, but also the dynamics of the learning process itself. We argue that this style of analysis may provide insights into the training pathologies of recurrent architectures in contexts outside of time-awareness.
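The fixed point stability analysis mentioned in the abstract can be illustrated with a minimal sketch. It assumes a vanilla tanh RNN with constant (zero) input and small random weights, which is not the authors' trained model; the names `W`, `b`, `find_fixed_point`, and `spectral_radius` are hypothetical. A fixed point h* satisfies h* = tanh(W h* + b), and it is stable when every eigenvalue of the Jacobian at h* lies inside the unit circle. An eigenvalue crossing the unit circle during training is the kind of event that would mark a bifurcation.

```python
import numpy as np

def find_fixed_point(W, b, steps=500):
    # Iterate the autonomous update h <- tanh(W h + b).
    # This converges here only because W is scaled to be a contraction (an assumption).
    h = np.zeros(W.shape[0])
    for _ in range(steps):
        h = np.tanh(W @ h + b)
    return h

def spectral_radius(W, h_star):
    # Jacobian of tanh(W h + b) at the fixed point: diag(1 - h*^2) @ W,
    # using h* = tanh(W h* + b) so that tanh'(.) = 1 - h*^2.
    J = (1.0 - h_star**2)[:, None] * W
    return np.abs(np.linalg.eigvals(J)).max()

rng = np.random.default_rng(0)
A = rng.standard_normal((16, 16))
W = 0.5 * A / np.linalg.norm(A, 2)   # spectral norm 0.5, so the map is contractive
b = 0.1 * rng.standard_normal(16)

h_star = find_fixed_point(W, b)
residual = np.linalg.norm(np.tanh(W @ h_star + b) - h_star)
radius = spectral_radius(W, h_star)
is_stable = radius < 1.0
```

For a trained network, simple iteration would only find stable fixed points; locating unstable ones typically requires a root finder or an optimizer, which this sketch omits.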
NE Oct 3, 2023
Episodic Memory Theory for the Mechanistic Interpretation of Recurrent Neural Networks
Arjun Karuvally, Peter Delmastro, Hava T. Siegelmann
Understanding the intricate operations of Recurrent Neural Networks (RNNs) mechanistically is pivotal for advancing their capabilities and applications. In this pursuit, we propose the Episodic Memory Theory (EMT), illustrating that RNNs can be conceptualized as discrete-time analogs of the recently proposed General Sequential Episodic Memory Model. To substantiate EMT, we introduce a novel set of algorithmic tasks tailored to probe the variable binding behavior in RNNs. Utilizing the EMT, we formulate a mathematically rigorous circuit that facilitates variable binding in these tasks. Our empirical investigations reveal that trained RNNs consistently converge to the variable binding circuit, thus indicating universality in the dynamics of RNNs. Building on these findings, we devise an algorithm to define a privileged basis, which reveals hidden neurons instrumental in the temporal storage and composition of variables, a mechanism vital for the successful generalization in these tasks. We show that the privileged basis enhances the interpretability of the learned parameters and hidden states of RNNs. Our work represents a step toward demystifying the internal mechanisms of RNNs and, for computational neuroscience, serves to bridge the gap between artificial neural networks and neural memory models.
NA May 2
Completely Positive and Trace Preserving Schemes with Tensor Train Compression for the Lindblad Equation
Peter DelMastro, Daniel Appelö, Yingda Cheng
We propose a family of low-rank, completely positive and trace preserving schemes for the Lindblad equation, a common model for open quantum systems. Low-rank representation is employed at two levels: the density matrix is factorized into the product of tall-skinny matrices, and the columns of these matrices are further represented using the tensor train (TT) format, also known as matrix product states (MPS). This two-level low-rank format fits naturally into our existing Kraus is King scheme (arXiv:2409.08898v2 [math.NA]) for the Lindblad equation, whose underlying operations are arithmetic on the columns of the tall-skinny matrices. We show how these operations can be performed efficiently in the TT/MPS format, with particular emphasis on density matrix rank-truncation. We conclude with extensive numerical experiments demonstrating the convergence of this scheme and its efficiency in simulating systems with up to $10^{19}$ degrees of freedom using only modest compute resources.
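The first low-rank level described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' scheme: the density matrix is stored as ρ = Z Z† with a tall-skinny factor Z, which makes ρ positive semidefinite by construction, and its trace equals the squared Frobenius norm of Z. The function names, the SVD-based truncation, and the renormalization step are hypothetical choices; the TT/MPS compression of the columns of Z is omitted.

```python
import numpy as np

def random_factor(n, r, seed=0):
    # Tall-skinny complex factor Z (n x r), scaled so that trace(Z Z^†) = ||Z||_F^2 = 1.
    rng = np.random.default_rng(seed)
    Z = rng.standard_normal((n, r)) + 1j * rng.standard_normal((n, r))
    return Z / np.linalg.norm(Z)

def truncate_rank(Z, r_new):
    # Z = U S V^† implies Z Z^† = U S^2 U^†, so keeping the r_new largest
    # singular values gives the best rank-r_new approximation of rho.
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    Zt = U[:, :r_new] * s[:r_new]
    # Renormalize so the truncated state still has unit trace (an assumed choice).
    return Zt / np.linalg.norm(Zt)

Z = random_factor(64, 8)
Zt = truncate_rank(Z, 4)

rho = Zt @ Zt.conj().T                   # formed densely here only to check the result
trace = np.trace(rho).real
min_eig = np.linalg.eigvalsh(rho).min()  # nonnegative up to rounding error
```

Working with Z rather than ρ means positivity never has to be enforced after the fact, and the truncation only touches an n × r matrix instead of the full n × n density matrix.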
NE Jul 28, 2025
Reservoir Computation with Networks of Differentiating Neuron Ring Oscillators
Alexander Yeung, Peter DelMastro, Arjun Karuvally et al.
Reservoir Computing is a machine learning approach that uses the rich repertoire of complex system dynamics for function approximation. Current approaches to reservoir computing use a network of coupled integrating neurons that require a steady current to maintain activity. Here, we introduce a small-world graph of differentiating neurons, which are active only when their input changes, as an alternative reservoir computing substrate to integrating neurons. We find the coupling strength and network topology that enable these small-world networks to function as an effective reservoir. We demonstrate the efficacy of these networks on the MNIST digit recognition task, achieving a performance of 90.65%, comparable to existing reservoir computing approaches. The findings suggest that differentiating neurons can be a potential alternative to integrating neurons and can provide a sustainable future alternative for power-hungry AI applications.