ML May 29
Memory by Design: Probabilistic Sequence Layers
Matthew Dowling, Hyungju Jeon, Cristina Savin et al.
We introduce the design-model framework: a way to derive efficient recurrent sequence maps from explicit assumptions about memory. A design model writes evidence into memory by exact Bayesian filtering; a query-dependent readout produces a predictive distribution whose mean is the layer output. In our linear-Gaussian instantiation, the Bayesian Layer propagates both a mean and a covariance: the covariance tracks uncertainty over stored associations, steering writes toward uncertain directions, attenuating gains as evidence accumulates, and preserving confident memories. The same framework unifies several sub-quadratic recurrences. Linear attention, GLA, and Mamba-2/SSD are exact filters under one design model, whereas DeltaNet and related Delta-rule models arise as covariance-reset reductions under another. Restoring the covariance yields closed-form predictions for retrieval dynamics, verified empirically, and improves robustness beyond the training regime across controlled collision studies, learned associative recall, and the Zoology MQAR benchmark; distilling Bayesian Layers into a pretrained 340M Gated DeltaNet improves RULER long-context retrieval at matched compute.
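To make the mean-and-covariance idea concrete, here is a minimal sketch of a linear-Gaussian associative memory written with a Kalman (recursive least squares) update. It is an illustration under stated assumptions, not the paper's layer: the observation model v = M k + noise, the shared covariance `P`, the noise variance `r`, and the function names `bayes_write` and `read` are all hypothetical choices made for this example.

```python
import numpy as np

def bayes_write(M, P, k, v, r=1.0):
    """One Kalman-style write of the key/value pair (k, v).

    Assumed model (illustrative only): v = M @ k + noise, noise ~ N(0, r * I),
    with each row of M sharing the same covariance P.
    M: (d_v, d_k) posterior mean, P: (d_k, d_k) posterior covariance.
    """
    Pk = P @ k
    gain = Pk / (k @ Pk + r)            # larger along directions P marks as uncertain
    M = M + np.outer(v - M @ k, gain)   # correct the prediction error for this key
    P = P - np.outer(gain, Pk)          # uncertainty along k shrinks after the write
    return M, P

def read(M, P, q, r=1.0):
    """Predictive mean and per-output variance for query q."""
    return M @ q, q @ P @ q + r

# Usage: write one association, then compare a seen and an unseen query.
M0, P0 = np.zeros((2, 3)), np.eye(3)
key, value = np.array([1.0, 0.0, 0.0]), np.array([2.0, -1.0])
M1, P1 = bayes_write(M0, P0, key, value, r=1e-6)
mean_seen, var_seen = read(M1, P1, key, r=1e-6)
_, var_unseen = read(M1, P1, np.array([0.0, 1.0, 0.0]), r=1e-6)
```

In this sketch, resetting `P` to the identity before every write reduces the gain to `k / (k @ k + r)`, which gives a normalized delta-rule-style update. That is only meant to illustrate what a covariance reset could look like; the abstract does not spell out the exact reduction.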
NC Jun 1, 2023
Linear Time GPs for Inferring Latent Trajectories from Neural Spike Trains
Matthew Dowling, Yuan Zhao, Il Memming Park
Latent Gaussian process (GP) models are widely used in neuroscience to uncover hidden state evolutions from sequential observations, mainly in neural activity recordings. While latent GP models provide a principled and powerful solution in theory, the intractable posterior in non-conjugate settings necessitates approximate inference schemes, which may lack scalability. In this work, we propose cvHM, a general inference framework for latent GP models leveraging Hida-Matérn kernels and conjugate computation variational inference (CVI). With cvHM, we are able to perform variational inference of latent neural trajectories with linear time complexity for arbitrary likelihoods. The reparameterization of stationary kernels using Hida-Matérn GPs helps us connect the latent variable models that encode prior assumptions through dynamical systems to those that encode trajectory assumptions through GPs. In contrast to previous work, we use bidirectional information filtering, leading to a more concise implementation. Furthermore, we employ the Whittle approximate likelihood to achieve highly efficient hyperparameter learning.
NC Apr 26
Integrative neurocybernetic modeling in the era of large-scale neuroscience
Il Memming Park, Ayesha Vermani, Gonzalo G. de Polavieja et al.
Large-scale neuroscience is generating rich datasets across animals, brain areas and behavioral contexts, yet our modeling efforts remain fragmented across isolated experiments. We argue that understanding behavior requires integrative neurocybernetic models: understandable dynamical models that capture the closed-loop coupling of brain, body and environment, treat the brain as a controller pursuing latent objectives, represent structured variation across scales, and scale to heterogeneous datasets. Such models shift the goal from predicting neural recordings in isolation to inferring the organizing principles that govern neural and behavioral dynamics. We outline a practical route toward this goal by combining nonlinear state-space models and meta-dynamical extensions with scalable inference, knowledge distillation, mixed open- and closed-loop training, and connectomics-informed architectures. By pooling complementary constraints from recordings, behavior, perturbations and anatomy, integrative neurocybernetic models can provide statistical amplification, few-shot generalization, and mechanistic insight into shared dynamical structure, individual variation, and the control objectives that govern behavior. This agenda offers a model-centric path from fragmented data to a mechanistic science of how brains produce behavior.
ML Mar 3, 2024
eXponential FAmily Dynamical Systems (XFADS): Large-scale nonlinear Gaussian state-space modeling
Matthew Dowling, Yuan Zhao, Il Memming Park
State-space graphical models and the variational autoencoder framework provide a principled apparatus for learning dynamical systems from data. State-of-the-art probabilistic approaches are often able to scale to large problems at the cost of flexibility of the variational posterior or expressivity of the dynamics model. However, those consolidations can be detrimental if the ultimate goal is to learn a generative model capable of explaining the spatiotemporal structure of the data and making accurate forecasts. We introduce a low-rank structured variational autoencoding framework for nonlinear Gaussian state-space graphical models capable of capturing dense covariance structures that are important for learning dynamical systems with predictive capabilities. Our inference algorithm exploits the covariance structures that arise naturally from sample based approximate Gaussian message passing and low-rank amortized posterior updates -- effectively performing approximate variational smoothing with time complexity scaling linearly in the state dimensionality. In comparisons with other deep state-space model architectures our approach consistently demonstrates the ability to learn a more predictive generative model. Furthermore, when applied to neural physiological recordings, our approach is able to learn a dynamical system capable of forecasting population spiking and behavioral correlates from a small portion of single trials.
ML May 18, 2023
Real-Time Variational Method for Learning Neural Trajectory and its Dynamics
Matthew Dowling, Yuan Zhao, Il Memming Park
Latent variable models have become instrumental in computational neuroscience for reasoning about neural computation. This has fostered the development of powerful offline algorithms for extracting latent neural trajectories from neural recordings. However, despite the potential of real-time alternatives to give immediate feedback to experimentalists and enhance experimental design, they have received markedly less attention. In this work, we introduce the exponential family variational Kalman filter (eVKF), an online recursive Bayesian method aimed at inferring latent trajectories while simultaneously learning the dynamical system generating them. eVKF works for arbitrary likelihoods and utilizes the constant base measure exponential family to model the latent state stochasticity. We derive a closed-form variational analogue to the predict step of the Kalman filter, which leads to a provably tighter bound on the ELBO compared to another online variational method. We validate our method on synthetic and real-world data, and, notably, show that it achieves competitive performance
ML Jul 15, 2021
Hida-Matérn Kernel
Matthew Dowling, Piotr Sokół, Il Memming Park
We present the class of Hida-Matérn kernels, which is the canonical family of covariance functions over the entire space of stationary Gauss-Markov processes. It extends Matérn kernels by allowing for flexible construction of priors over processes with oscillatory components. Any stationary kernel, including the widely used squared-exponential and spectral mixture kernels, is either directly within this class or is an appropriate asymptotic limit, demonstrating the generality of this class. Taking advantage of its Markovian nature, we show how to represent such processes as state space models using only the kernel and its derivatives. In turn, this allows us to perform Gaussian process inference more efficiently and sidestep the usual computational burdens. We also show how exploiting special properties of the state space representation enables improved numerical stability in addition to further reductions of computational complexity.
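As a small illustration of building a state space model from a kernel and its derivatives, the sketch below uses the ordinary Matérn-3/2 kernel, a standard special case, rather than the general Hida-Matérn construction. The function names, the small Taylor-series `expm` helper, and the parameters `sigma2` and `ell` are assumptions made for this example. The stationary covariance is filled in from k(0) and -k''(0), and the kernel is then recovered from the state space model as H exp(F tau) P_inf H^T.

```python
import numpy as np

def expm(A, squarings=10, terms=12):
    """Matrix exponential via scaling-and-squaring with a truncated Taylor series."""
    A = A / (2 ** squarings)
    E = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for i in range(1, terms + 1):
        term = term @ A / i
        E = E + term
    for _ in range(squarings):
        E = E @ E
    return E

def matern32_kernel(tau, sigma2=1.0, ell=1.0):
    """Closed-form Matern-3/2 covariance at lag tau."""
    lam = np.sqrt(3.0) / ell
    return sigma2 * (1.0 + lam * abs(tau)) * np.exp(-lam * abs(tau))

def matern32_ssm(sigma2=1.0, ell=1.0):
    """State (f, f') with drift F, stationary covariance P_inf, and readout H."""
    lam = np.sqrt(3.0) / ell
    F = np.array([[0.0, 1.0], [-lam ** 2, -2.0 * lam]])
    # Diagonal entries are k(0) and -k''(0) of the Matern-3/2 kernel.
    P_inf = np.diag([sigma2, lam ** 2 * sigma2])
    H = np.array([1.0, 0.0])
    return F, P_inf, H

def ssm_kernel(tau, F, P_inf, H):
    """Covariance of f at lag tau implied by the state space model."""
    return H @ expm(F * abs(tau)) @ P_inf @ H

F, P_inf, H = matern32_ssm(sigma2=2.0, ell=0.7)
max_err = max(abs(ssm_kernel(t, F, P_inf, H) - matern32_kernel(t, 2.0, 0.7))
              for t in [0.0, 0.3, 1.0, 2.5])
```

Once a process is in this form, Kalman filtering and smoothing over T time points cost time linear in T, which is the kind of saving the abstract refers to.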
ML Sep 2, 2020
Non-parametric generalized linear model
Matthew Dowling, Yuan Zhao, Il Memming Park
A fundamental problem in statistical neuroscience is to model how neurons encode information by analyzing electrophysiological recordings. A widely used approach is to fit the spike trains with an autoregressive point process model. These models are characterized by a set of convolutional temporal filters, whose subsequent analysis can help reveal how neurons encode stimuli, interact with each other, and process information. In practice, a sufficiently rich but small ensemble of temporal basis functions needs to be chosen to parameterize the filters. However, obtaining a satisfactory fit often requires burdensome model selection and fine-tuning of the form of the basis functions and their temporal span. In this paper, we propose a nonparametric approach for jointly inferring the filters and hyperparameters using the Gaussian process framework. Our method is computationally efficient, taking advantage of the sparse variational approximation, while being flexible and rich enough to characterize arbitrary filters in continuous time lag. Moreover, our method automatically learns the temporal span of the filter. For the particular application in neuroscience, we designed priors for stimulus and history filters useful for spike trains. We compare and validate our method on simulated and real neural spike train data.
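For orientation, the sketch below shows the kind of basis-parameterized autoregressive point process model that the abstract describes as the conventional approach, not the proposed Gaussian process method. The Gaussian-bump basis, the bin width `dt`, and every function name here are hypothetical choices for illustration.

```python
import numpy as np

def bump_basis(n_lags, n_basis, width=2.0):
    """Hypothetical temporal basis: Gaussian bumps spread evenly over the lags."""
    lags = np.arange(n_lags)[:, None]
    centers = np.linspace(0, n_lags - 1, n_basis)[None, :]
    return np.exp(-0.5 * ((lags - centers) / width) ** 2)   # shape (n_lags, n_basis)

def history_drive(spikes, h):
    """Causal convolution: drive[t] = sum over j >= 1 of h[j-1] * spikes[t-j]."""
    T = len(spikes)
    full = np.convolve(spikes, h)                # full[n] = sum_j h[j] * spikes[n-j]
    return np.concatenate(([0.0], full[:T - 1])) # shift by one bin so only the past counts

def poisson_loglik(spikes, bias, h, dt=1e-3):
    """Binned Poisson log-likelihood, dropping the constant log(y!) term."""
    log_rate = bias + history_drive(spikes, h)
    return np.sum(spikes * (log_rate + np.log(dt)) - np.exp(log_rate) * dt)

# Usage: a short spike train with a suppressive two-lag history filter.
spikes = np.array([0.0, 1.0, 0.0, 0.0, 1.0, 0.0])
h = np.array([-2.0, -1.0])
drive = history_drive(spikes, h)
ll = poisson_loglik(spikes, bias=np.log(20.0), h=h)

# A filter built from basis weights, as in the parametric setting.
B = bump_basis(n_lags=10, n_basis=3)
h_from_basis = B @ np.array([-1.0, 0.5, 0.0])
```

The number, shape, and span of the basis functions in `bump_basis` are exactly the choices that the abstract says require manual model selection in the parametric setting.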