Manfred Opper

h-index22

39papers

452citations

Novelty52%

AI Score55

Ranked #23,337 of 201,326 authors (top 12%)#184 in ML (top 5%)

39 Papers

LGOct 26, 2023

Generative Fractional Diffusion Models

Gabriel Nobis, Maximilian Springenberg, Marco Aversa et al.

We introduce the first continuous-time score-based generative model that leverages fractional diffusion processes for its underlying dynamics. Although diffusion models have excelled at capturing data distributions, they still suffer from various limitations such as slow convergence, mode-collapse on imbalanced data, and lack of diversity. These issues are partially linked to the use of light-tailed Brownian motion (BM) with independent increments. In this paper, we replace BM with an approximation of its non-Markovian counterpart, fractional Brownian motion (fBM), characterized by correlated increments and Hurst index $H \in (0,1)$, where $H=0.5$ recovers the classical BM. To ensure tractable inference and learning, we employ a recently popularized Markov approximation of fBM (MA-fBM) and derive its reverse-time model, resulting in generative fractional diffusion models (GFDM). We characterize the forward dynamics using a continuous reparameterization trick and propose augmented score matching to efficiently learn the score function, which is partly known in closed form, at minimal added cost. The ability to drive our diffusion model via MA-fBM offers flexibility and control. $H \leq 0.5$ enters the regime of rough paths whereas $H>0.5$ regularizes diffusion paths and invokes long-term memory. The Markov approximation allows added control by varying the number of Markov processes linearly combined to approximate fBM. Our evaluations on real image datasets demonstrate that GFDM achieves greater pixel-wise diversity and enhanced image quality, as indicated by a lower FID, offering a promising alternative to traditional diffusion models

LGOct 19, 2023

Variational Inference for SDEs Driven by Fractional Noise

Rembert Daems, Manfred Opper, Guillaume Crevecoeur et al.

We present a novel variational framework for performing inference in (neural) stochastic differential equations (SDEs) driven by Markov-approximate fractional Brownian motion (fBM). SDEs offer a versatile tool for modeling real-world continuous-time dynamic systems with inherent noise and randomness. Combining SDEs with the powerful inference capabilities of variational methods, enables the learning of representative function distributions through stochastic gradient descent. However, conventional SDEs typically assume the underlying noise to follow a Brownian motion (BM), which hinders their ability to capture long-term dependencies. In contrast, fractional Brownian motion (fBM) extends BM to encompass non-Markovian dynamics, but existing methods for inferring fBM parameters are either computationally demanding or statistically inefficient. In this paper, building upon the Markov approximation of fBM, we derive the evidence lower bound essential for efficient variational inference of posterior path measures, drawing from the well-established field of stochastic analysis. Additionally, we provide a closed-form expression to determine optimal approximation coefficients. Furthermore, we propose the use of neural networks to learn the drift, diffusion and control terms within our variational posterior, leading to the variational training of neural-SDEs. In this framework, we also optimize the Hurst index, governing the nature of our fractional noise. Beyond validation on synthetic data, we contribute a novel architecture for variational latent video prediction,-an approach that, to the best of our knowledge, enables the first variational neural-SDE application to video perception.

OCMay 8

On a mean-field Pontryagin minimum principle for stochastic optimal control

Manfred Opper, Sebastian Reich

This paper outlines a novel extension of the classical Pontryagin minimum (maximum) principle to stochastic optimal control problems. Contrary to the well-known stochastic Pontryagin minimum principle involving forward-backward stochastic differential equations, the proposed formulation is deterministic and of mean-field type. We denote it by the McKean-Pontryagin minimum principle. The Hamiltonian structure of the proposed McKean-Pontryagin minimum principle is achieved via the introduction of a pair of auxiliary functions. A gauge freedom in the choice of one of these two functions can be used to decouple the forward and reverse time equations; hence simplifying the solution of the underlying boundary value problem. We also consider infinite horizon discounted cost optimal control problems. In this case, the mean-field formulation allows one to convert the computation of the desired optimal control law into solving a pair of forward mean-field ordinary differential equations. The McKean-Pontryagin minimum principle is tested numerically for a controlled inverted pendulum, a controlled Lorenz-63 system, and a controlled Lorenz-96 system. Although the focus is on linear-quadratic control problems, the proposed methodology is extendable to more general problems including mean-field type control formulations.

OCApr 28

Digital Twins: McKean-Pontryagin Control for Partially Observed Physical Twins

Manfred Opper, Sebastian Reich

Optimal control for fully observed diffusion processes is well established and has led to numerous numerical implementations based on, for example, Bellman's principle, model free reinforcement learning, Pontryagin's maximum principle, and model predictive control. In contrast, much fewer algorithms are available for optimal control of partially observed processes. However, this scenario is central to the digital twin paradigm, where a physical twin is partially observed and control laws are derived based on a digital twin. In this paper, we contribute to this challenge by combining data assimilation in the form of the ensemble Kalman filter with the recently proposed McKean-Pontryagin approach to stochastic optimal control. We derive forward evolving mean-field evolution equations for states and co-states which simultaneously allow for an online assimilation of data as well as an online computation of control laws. The proposed methodology is therefore perfectly suited for real time applications of digital twins. We present numerical results for controlled Lorenz-63 and Lorenz-96 systems as well as an inverted pendulum.

LGMay 11

Variational Inference for Lévy Process-Driven SDEs via Neural Tilting

Yaman Kindap, Manfred Opper, Benjamin Dupuis et al.

Modelling extreme events and heavy-tailed phenomena is central to building reliable predictive systems in domains such as finance, climate science, and safety-critical AI. While Lévy processes provide a natural mathematical framework for capturing jumps and heavy tails, Bayesian inference for Lévy-driven stochastic differential equations (SDEs) remains intractable with existing methods: Monte Carlo approaches are rigorous but lack scalability, whereas neural variational inference methods are efficient but rely on Gaussian assumptions that fail to capture discontinuities. We address this tension by introducing a neural exponential tilting framework for variational inference in Lévy-driven SDEs. Our approach constructs a flexible variational family by exponentially reweighting the Lévy measure using neural networks. This parametrization preserves the jump structure of the underlying process while remaining computationally tractable. To enable efficient inference, we develop a quadratic neural parametrization that yields closed-form normalization of the tilted measure, a conditional Gaussian representation for stable processes that facilitates simulation, and symmetry-aware Monte Carlo estimators for scalable optimization. Empirically, we demonstrate that the method accurately captures jump dynamics and yields reliable posterior inference in regimes where Gaussian-based variational approaches fail, on both synthetic and real-world datasets.

LGNov 3, 2025

Fractional Diffusion Bridge Models

Gabriel Nobis, Maximilian Springenberg, Arina Belova et al.

We present Fractional Diffusion Bridge Models (FDBM), a novel generative diffusion bridge framework driven by an approximation of the rich and non-Markovian fractional Brownian motion (fBM). Real stochastic processes exhibit a degree of memory effects (correlations in time), long-range dependencies, roughness and anomalous diffusion phenomena that are not captured in standard diffusion or bridge modeling due to the use of Brownian motion (BM). As a remedy, leveraging a recent Markovian approximation of fBM (MA-fBM), we construct FDBM that enable tractable inference while preserving the non-Markovian nature of fBM. We prove the existence of a coupling-preserving generative diffusion bridge and leverage it for future state prediction from paired training data. We then extend our formulation to the Schrödinger bridge problem and derive a principled loss function to learn the unpaired data translation. We evaluate FDBM on both tasks: predicting future protein conformations from aligned data, and unpaired image translation. In both settings, FDBM achieves superior performance compared to the Brownian baselines, yielding lower root mean squared deviation (RMSD) of C$_α$ atomic positions in protein structure prediction and lower Fréchet Inception Distance (FID) in unpaired image translation.

LGFeb 13, 2024

A Convergence Analysis of Approximate Message Passing with Non-Separable Functions and Applications to Multi-Class Classification

Burak Çakmak, Yue M. Lu, Manfred Opper

Motivated by the recent application of approximate message passing (AMP) to the analysis of convex optimizations in multi-class classifications [Loureiro, et. al., 2021], we present a convergence analysis of AMP dynamics with non-separable multivariate nonlinearities. As an application, we present a complete (and independent) analysis of the motivated convex optimization problem.

LGMay 22, 2025

Efficient Training of Neural SDEs Using Stochastic Optimal Control

Rembert Daems, Manfred Opper, Guillaume Crevecoeur et al.

We present a hierarchical, control theory inspired method for variational inference (VI) for neural stochastic differential equations (SDEs). While VI for neural SDEs is a promising avenue for uncertainty-aware reasoning in time-series, it is computationally challenging due to the iterative nature of maximizing the ELBO. In this work, we propose to decompose the control term into linear and residual non-linear components and derive an optimal control term for linear SDEs, using stochastic optimal control. Modeling the non-linear component by a neural network, we show how to efficiently train neural SDEs without sacrificing their expressive power. Since the linear part of the control term is optimal and does not need to be learned, the training is initialized at a lower cost and we observe faster convergence.

HEP-PHSep 1, 2025

Multimodal Generative Flows for LHC Jets

Darius A. Faroughy, Manfred Opper, Cesar Ojeda

Generative modeling of high-energy collisions at the Large Hadron Collider (LHC) offers a data-driven route to simulations, anomaly detection, among other applications. A central challenge lies in the hybrid nature of particle-cloud data: each particle carries continuous kinematic features and discrete quantum numbers such as charge and flavor. We introduce a transformer-based multimodal flow that extends flow-matching with a continuous-time Markov jump bridge to jointly model LHC jets with both modalities. Trained on CMS Open Data, our model can generate high fidelity jets with realistic kinematics, jet substructure and flavor composition.

MLMay 6, 2024

Bridging discrete and continuous state spaces: Exploring the Ehrenfest process in time-continuous diffusion models

Ludwig Winkler, Lorenz Richter, Manfred Opper

Generative modeling via stochastic processes has led to remarkable empirical results as well as to recent advances in their theoretical understanding. In principle, both space and time of the processes can be discrete or continuous. In this work, we study time-continuous Markov jump processes on discrete state spaces and investigate their correspondence to state-continuous diffusion processes given by SDEs. In particular, we revisit the $\textit{Ehrenfest process}$, which converges to an Ornstein-Uhlenbeck process in the infinite state space limit. Likewise, we can show that the time-reversal of the Ehrenfest process converges to the time-reversed Ornstein-Uhlenbeck process. This observation bridges discrete and continuous state spaces and allows to carry over methods from one to the respective other setting. Additionally, we suggest an algorithm for training the time-reversal of Markov jump processes which relies on conditional expectations and can thus be directly related to denoising score matching. We demonstrate our methods in multiple convincing numerical experiments.

LGFeb 16, 2022

Analysis of Random Sequential Message Passing Algorithms for Approximate Inference

Burak Çakmak, Yue M. Lu, Manfred Opper

We analyze the dynamics of a random sequential message passing algorithm for approximate inference with large Gaussian latent variable models in a student-teacher scenario. To model nontrivial dependencies between the latent variables, we assume random covariance matrices drawn from rotation invariant ensembles. Moreover, we consider a model mismatching setting, where the teacher model and the one used by the student may be different. By means of dynamical functional approach, we obtain exact dynamical mean-field equations characterizing the dynamics of the inference algorithm. We also derive a range of model parameters for which the sequential algorithm does not converge. The boundary of this parameter range coincides with the de Almeida Thouless (AT) stability condition of the replica symmetric ansatz for the static probabilistic model.

MLJul 21, 2021

Adaptive Inducing Points Selection For Gaussian Processes

Théo Galy-Fajou, Manfred Opper

Gaussian Processes (\textbf{GPs}) are flexible non-parametric models with strong probabilistic interpretation. While being a standard choice for performing inference on time series, GPs have few techniques to work in a streaming setting. \cite{bui2017streaming} developed an efficient variational approach to train online GPs by using sparsity techniques: The whole set of observations is approximated by a smaller set of inducing points (\textbf{IPs}) and moved around with new data. Both the number and the locations of the IPs will affect greatly the performance of the algorithm. In addition to optimizing their locations, we propose to adaptively add new points, based on the properties of the GP and the structure of the data.

MLMay 20, 2021

Nonlinear Hawkes Process with Gaussian Process Self Effects

Noa Malem-Shinitski, Cesar Ojeda, Manfred Opper

Traditionally, Hawkes processes are used to model time--continuous point processes with history dependence. Here we propose an extended model where the self--effects are of both excitatory and inhibitory type and follow a Gaussian Process. Whereas previous work either relies on a less flexible parameterization of the model, or requires a large amount of data, our formulation allows for both a flexible model and learning when data are scarce. We continue the line of work of Bayesian inference for Hawkes processes, and our approach dispenses with the necessity of estimating a branching structure for the posterior, as we perform inference on an aggregated sum of Gaussian Processes. Efficient approximate Bayesian inference is achieved via data augmentation, and we describe a mean--field variational inference approach to learn the model parameters. To demonstrate the flexibility of the model we apply our methodology on data from three different domains and compare it to previously reported results.

DIS-NNJan 5, 2021

Exact solution to the random sequential dynamics of a message passing algorithm

Burak Çakmak, Manfred Opper

We analyze the random sequential dynamics of a message passing algorithm for Ising models with random interactions in the large system limit. We derive exact results for the two-time correlation functions and the speed of convergence. The {\em de Almedia-Thouless} stability criterion of the static problem is found to be necessary and sufficient for the global convergence of the random sequential dynamics.

LGMay 4, 2020

A Dynamical Mean-Field Theory for Learning in Restricted Boltzmann Machines

Burak Çakmak, Manfred Opper

We define a message-passing algorithm for computing magnetizations in Restricted Boltzmann machines, which are Ising models on bipartite graphs introduced as neural network models for probability distributions over spin configurations. To model nontrivial statistical dependencies between the spins' couplings, we assume that the rectangular coupling matrix is drawn from an arbitrary bi-rotation invariant random matrix ensemble. Using the dynamical functional method of statistical mechanics we exactly analyze the dynamics of the algorithm in the large system limit. We prove the global convergence of the algorithm under a stability criterion and compute asymptotic convergence rates showing excellent agreement with numerical simulations.

MLFeb 26, 2020

Automated Augmented Conjugate Inference for Non-conjugate Gaussian Process Models

Théo Galy-Fajou, Florian Wenzel, Manfred Opper

We propose automated augmented conjugate inference, a new inference method for non-conjugate Gaussian processes (GP) models. Our method automatically constructs an auxiliary variable augmentation that renders the GP model conditionally conjugate. Building on the conjugate structure of the augmented model, we develop two inference methods. First, a fast and scalable stochastic variational inference method that uses efficient block coordinate ascent updates, which are computed in closed form. Second, an asymptotically correct Gibbs sampler that is useful for small datasets. Our experiments show that our method are up two orders of magnitude faster and more robust than existing state-of-the-art black-box methods.

STAT-MECHFeb 3, 2020

Understanding the dynamics of message passing algorithms: a free probability heuristics

Manfred Opper, Burak Çakmak

We use freeness assumptions of random matrix theory to analyze the dynamical behavior of inference algorithms for probabilistic models with dense coupling matrices in the limit of large systems. For a toy Ising model, we are able to recover previous results such as the property of vanishing effective memories and the analytical convergence rate of the algorithm.

LGJan 14, 2020

Analysis of Bayesian Inference Algorithms by the Dynamical Functional Approach

Burak Çakmak, Manfred Opper

We analyze the dynamics of an algorithm for approximate inference with large Gaussian latent variable models in a student-teacher scenario. To model nontrivial dependencies between the latent variables, we assume random covariance matrices drawn from rotation invariant ensembles. For the case of perfect data-model matching, the knowledge of static order parameters derived from the replica method allows us to obtain efficient algorithmic updates in terms of matrix-vector multiplications with a fixed matrix. Using the dynamical functional approach, we obtain an exact effective stochastic process in the thermodynamic limit for a single node. From this, we obtain closed-form expressions for the rate of the convergence. Analytical results are excellent agreement with simulations of single instances of large models.

MLSep 30, 2019

Tightening Bounds for Variational Inference by Revisiting Perturbation Theory

Robert Bamler, Cheng Zhang, Manfred Opper et al.

Variational inference has become one of the most widely used methods in latent variable modeling. In its basic form, variational inference employs a fully factorized variational distribution and minimizes its KL divergence to the posterior. As the minimization can only be carried out approximately, this approximation induces a bias. In this paper, we revisit perturbation theory as a powerful way of improving the variational approximation. Perturbation theory relies on a form of Taylor expansion of the log marginal likelihood, vaguely in terms of the log ratio of the true posterior and its variational approximation. While first order terms give the classical variational bound, higher-order terms yield corrections that tighten it. However, traditional perturbation theory does not provide a lower bound, making it inapt for stochastic optimization. In this paper, we present a similar yet alternative way of deriving corrections to the ELBO that resemble perturbation theory, but that result in a valid bound. We show in experiments on Gaussian Processes and Variational Autoencoders that the new bounds are more mass covering, and that the resulting posterior covariances are closer to the true posterior and lead to higher likelihoods on held-out data.

MLMay 23, 2019

Multi-Class Gaussian Process Classification Made Conjugate: Efficient Inference via Data Augmentation

Théo Galy-Fajou, Florian Wenzel, Christian Donner et al.

We propose a new scalable multi-class Gaussian process classification approach building on a novel modified softmax likelihood function. The new likelihood has two benefits: it leads to well-calibrated uncertainty estimates and allows for an efficient latent variable augmentation. The augmented model has the advantage that it is conditionally conjugate leading to a fast variational inference method via block coordinate ascent updates. Previous approaches suffered from a trade-off between uncertainty calibration and speed. Our experiments show that our method leads to well-calibrated uncertainty estimates and competitive predictive performance while being up to two orders faster than the state of the art.

DIS-NNJan 24, 2019

Memory-free dynamics for the TAP equations of Ising models with arbitrary rotation invariant ensembles of random coupling matrices

Burak Çakmak, Manfred Opper

We propose an iterative algorithm for solving the Thouless-Anderson-Palmer (TAP) equations of Ising models with arbitrary rotation invariant (random) coupling matrices. In the thermodynamic limit, we prove by means of the dynamical functional method that the proposed algorithm converges when the so-called de Almeida Thouless (AT) criterion is fulfilled. Moreover, we give exact analytical expressions for the rate of the convergence.

MLAug 2, 2018

Efficient Bayesian Inference of Sigmoidal Gaussian Cox Processes

Christian Donner, Manfred Opper

We present an approximate Bayesian inference approach for estimating the intensity of an inhomogeneous Poisson process, where the intensity function is modelled using a Gaussian process (GP) prior via a sigmoid link function. Augmenting the model using a latent marked Poisson process and Pólya--Gamma random variables we obtain a representation of the likelihood which is conjugate to the GP prior. We estimate the posterior using a variational free--form mean field optimisation together with the framework of sparse GPs. Furthermore, as alternative approximation we suggest a sparse Laplace's method for the posterior, for which an efficient expectation--maximisation algorithm is derived to find the posterior's mode. Both algorithms compare well against exact inference obtained by a Markov Chain Monte Carlo sampler and standard variational Gauss approach solving the same model, while being one order of magnitude faster. Furthermore, the performance and speed of our method is competitive with that of another recently proposed Poisson process model based on a quadratic link function, while not being limited to GPs with squared exponential kernels and rectangular domains.

MLMay 29, 2018

Efficient Bayesian Inference for a Gaussian Process Density Model

Christian Donner, Manfred Opper

We reconsider a nonparametric density model based on Gaussian processes. By augmenting the model with latent Pólya--Gamma random variables and a latent marked Poisson process we obtain a new likelihood which is conjugate to the model's Gaussian process prior. The augmented posterior allows for efficient inference by Gibbs sampling and an approximate variational mean field approach. For the latter we utilise sparse GP approximations to tackle the infinite dimensionality of the problem. The performance of both algorithms and comparisons with other density estimators are demonstrated on artificial and real datasets with up to several thousand data points.

MLFeb 18, 2018

Efficient Gaussian Process Classification Using Polya-Gamma Data Augmentation

Florian Wenzel, Theo Galy-Fajou, Christan Donner et al.

We propose a scalable stochastic variational approach to GP classification building on Polya-Gamma data augmentation and inducing points. Unlike former approaches, we obtain closed-form updates based on natural gradients that lead to efficient optimization. We evaluate the algorithm on real-world datasets containing up to 11 million data points and demonstrate that it is up to two orders of magnitude faster than the state-of-the-art while being competitive in terms of prediction performance.

ITJan 16, 2018

Expectation Propagation for Approximate Inference: Free Probability Framework

Burak Çakmak, Manfred Opper

We study asymptotic properties of expectation propagation (EP) -- a method for approximate inference originally developed in the field of machine learning. Applied to generalized linear models, EP iteratively computes a multivariate Gaussian approximation to the exact posterior distribution. The computational complexity of the repeated update of covariance matrices severely limits the application of EP to large problem sizes. In this study, we present a rigorous analysis by means of free probability theory that allows us to overcome this computational bottleneck if specific data matrices in the problem fulfill certain properties of asymptotic freeness. We demonstrate the relevance of our approach on the gene selection problem of a microarray dataset.

MLSep 21, 2017

Perturbative Black Box Variational Inference

Robert Bamler, Cheng Zhang, Manfred Opper et al.

Black box variational inference (BBVI) with reparameterization gradients triggered the exploration of divergence measures other than the Kullback-Leibler (KL) divergence, such as alpha divergences. In this paper, we view BBVI with generalized divergences as a form of estimating the marginal likelihood via biased importance sampling. The choice of divergence determines a bias-variance trade-off between the tightness of a bound on the marginal likelihood (low bias) and the variance of its gradient estimators. Drawing on variational perturbation theory of statistical physics, we use these insights to construct a family of new variational bounds. Enumerated by an odd integer order $K$, this family captures the standard KL bound for $K=1$, and converges to the exact marginal likelihood as $K\to\infty$. Compared to alpha-divergences, our reparameterization gradients have a lower variance. We show in experiments on Gaussian Processes and Variational Autoencoders that the new bounds are more mass covering, and that the resulting posterior covariances are closer to the true posterior and lead to higher likelihoods on held-out data.

MLSep 4, 2017

Inverse Ising problem in continuous time: A latent variable approach

Christian Donner, Manfred Opper

We consider the inverse Ising problem, i.e. the inference of network couplings from observed spin trajectories for a model with continuous time Glauber dynamics. By introducing two sets of auxiliary latent random variables we render the likelihood into a form, which allows for simple iterative inference algorithms with analytical updates. The variables are: (1) Poisson variables to linearise an exponential term which is typical for point process likelihoods and (2) Pólya-Gamma variables, which make the likelihood quadratic in the coupling parameters. Using the augmented likelihood, we derive an expectation-maximization (EM) algorithm to obtain the maximum likelihood estimate of network parameters. Using a third set of latent variables we extend the EM algorithm to sparse couplings via L1 regularization. Finally, we develop an efficient approximate Bayesian inference algorithm using a variational approach. We demonstrate the performance of our algorithms on data simulated from an Ising model. For data which are simulated from a more biologically plausible network with spiking neurons, we show that the Ising model captures well the low order statistics of the data and how the Ising couplings are related to the underlying synaptic structure of the simulated network.

DIS-NNMay 15, 2017

A statistical physics approach to learning curves for the Inverse Ising problem

Ludovica Bachschmid-Romano, Manfred Opper

Using methods of statistical physics, we analyse the error of learning couplings in large Ising models from independent data (the inverse Ising problem). We concentrate on learning based on local cost functions, such as the pseudo-likelihood method for which the couplings are inferred independently for each spin. Assuming that the data are generated from a true Ising model, we compute the reconstruction error of the couplings using a combination of the replica method with the cavity approach for densely connected systems. We show that an explicit estimator based on a quadratic cost function achieves minimal reconstruction error, but requires the length of the true coupling vector as prior knowledge. A simple mean field estimator of the couplings which does not need such knowledge is asymptotically optimal, i.e. when the number of observations is much large than the number of spins. Comparison of the theory with numerical simulations shows excellent agreement for data generated from two models with random couplings in the high temperature region: a model with independent couplings (Sherrington-Kirkpatrick model), and a model where the matrix of couplings has a Wishart distribution.

DATA-ANFeb 17, 2017

Approximate Bayes learning of stochastic differential equations

Philipp Batz, Andreas Ruttor, Manfred Opper

We introduce a nonparametric approach for estimating drift and diffusion functions in systems of stochastic differential equations from observations of the state vector. Gaussian processes are used as flexible models for these functions and estimates are calculated directly from dense data sets using Gaussian process regression. We also develop an approximate expectation maximization algorithm to deal with the unobserved, latent dynamics between sparse observations. The posterior over states is approximated by a piecewise linearized process of the Ornstein-Uhlenbeck type and the maximum a posteriori estimation of the drift is facilitated by a sparse Gaussian process approximation.

MLSep 12, 2016

Optimal Encoding and Decoding for Point Process Observations: an Approximate Closed-Form Filter

Yuval Harel, Ron Meir, Manfred Opper

The process of dynamic state estimation (filtering) based on point process observations is in general intractable. Numerical sampling techniques are often practically useful, but lead to limited conceptual insight about optimal encoding/decoding strategies, which are of significant relevance to Computational Neuroscience. We develop an analytically tractable Bayesian approximation to optimal filtering based on point process observations, which allows us to introduce distributional assumptions about sensor properties, that greatly facilitate the analysis of optimal encoding in situations deviating from common assumptions of uniform coding. Numerical comparison with particle filtering demonstrate the quality of the approximation. The analytic framework leads to insights which are difficult to obtain from numerical algorithms, and is consistent with biological observations about the distribution of sensory cells' tuning curve centers.

ITAug 23, 2016

Self-Averaging Expectation Propagation

Burak Çakmak, Manfred Opper, Bernard H. Fleury et al.

We investigate the problem of approximate Bayesian inference for a general class of observation models by means of the expectation propagation (EP) framework for large systems under some statistical assumptions. Our approach tries to overcome the numerical bottleneck of EP caused by the inversion of large matrices. Assuming that the measurement matrices are realizations of specific types of ensembles we use the concept of freeness from random matrix theory to show that the EP cavity variances exhibit an asymptotic self-averaging property. They can be pre-computed using specific generating functions, i.e. the R- and/or S-transforms in free probability, which do not require matrix inversions. Our approach extends the framework of (generalized) approximate message passing -- assumes zero-mean iid entries of the measurement matrix -- to a general class of random matrix ensembles. The generalization is via a simple formulation of the R- and/or S-transforms of the limiting eigenvalue distribution of the Gramian of the measurement matrix. We demonstrate the performance of our approach on a signal recovery problem of nonlinear compressed sensing and compare it with that of EP.

DIS-NNJul 28, 2016

Variational perturbation and extended Plefka approaches to dynamics on random networks: the case of the kinetic Ising model

Ludovica Bachschmid-Romano, Claudia Battistin, Manfred Opper et al.

We describe and analyze some novel approaches for studying the dynamics of Ising spin glass models. We first briefly consider the variational approach based on minimizing the Kullback-Leibler divergence between independent trajectories and the real ones and note that this approach only coincides with the mean field equations from the saddle point approximation to the generating functional when the dynamics is defined through a logistic link function, which is the case for the kinetic Ising model with parallel update. We then spend the rest of the paper developing two ways of going beyond the saddle point approximation to the generating functional. In the first one, we develop a variational perturbative approximation to the generating functional by expanding the action around a quadratic function of the local fields and conjugate local fields whose parameters are optimized. We derive analytical expressions for the optimal parameters and show that when the optimization is suitably restricted, we recover the mean field equations that are exact for the fully asymmetric random couplings (Mézard and Sakellariou, 2011). However, without this restriction the results are different. We also describe an extended Plefka expansion in which in addition to the magnetization, we also fix the correlation and response functions. Finally, we numerically study the performance of these approximations for Sherrington-Kirkpatrick type couplings for various coupling strengths, degrees of coupling symmetry and external fields. We show that the dynamical equations derived from the extended Plefka expansion outperform the others in all regimes, although it is computationally more demanding. The unconstrained variational approach does not perform well in the small coupling regime, while it approaches dynamical TAP equations of (Roudi and Hertz, 2011) for strong couplings.

MLDec 18, 2015

Expectation propagation for continuous time stochastic processes

Botond Cseke, David Schnoerr, Manfred Opper et al.

We consider the inverse problem of reconstructing the posterior measure over the trajec- tories of a diffusion process from discrete time observations and continuous time constraints. We cast the problem in a Bayesian framework and derive approximations to the posterior distributions of single time marginals using variational approximate inference. We then show how the approximation can be extended to a wide class of discrete-state Markov jump pro- cesses by making use of the chemical Langevin equation. Our empirical results show that the proposed method is computationally efficient and provides good approximations for these classes of inverse problems.

MLJul 28, 2015

An Analytically Tractable Bayesian Approximation to Optimal Point Process Filtering

Yuval Harel, Ron Meir, Manfred Opper

The process of dynamic state estimation (filtering) based on point process observations is in general intractable. Numerical sampling techniques are often practically useful, but lead to limited conceptual insight about optimal encoding/decoding strategies, which are of significant relevance to Computational Neuroscience. We develop an analytically tractable Bayesian approximation to optimal filtering based on point process observations, which allows us to introduce distributional assumptions about sensory cell properties, that greatly facilitates the analysis of optimal encoding in situations deviating from common assumptions of uniform coding. The analytic framework leads to insights which are difficult to obtain from numerical algorithms, and is consistent with experiments about the distribution of tuning curve centers. Interestingly, we find that the information gained from the absence of spikes may be crucial to performance.

MLSep 22, 2014

Expectation Propagation

Jack Raymond, Andre Manoel, Manfred Opper

Variational inference is a powerful concept that underlies many iterative approximation algorithms; expectation propagation, mean-field methods and belief propagations were all central themes at the school that can be perceived from this unifying framework. The lectures of Manfred Opper introduce the archetypal example of Expectation Propagation, before establishing the connection with the other approximation methods. Corrections by expansion about the expectation propagation are then explained. Finally some advanced inference topics and applications are explored in the final sections.

MLJun 27, 2014

Optimal Population Codes for Control and Estimation

Alex Susemihl, Ron Meir, Manfred Opper

Agents acting in the natural world aim at selecting appropriate actions based on noisy and partial sensory observations. Many behaviors leading to decision mak- ing and action selection in a closed loop setting are naturally phrased within a control theoretic framework. Within the framework of optimal Control Theory, one is usually given a cost function which is minimized by selecting a control law based on the observations. While in standard control settings the sensors are assumed fixed, biological systems often gain from the extra flexibility of optimiz- ing the sensors themselves. However, this sensory adaptation is geared towards control rather than perception, as is often assumed. In this work we show that sen- sory adaptation for control differs from sensory adaptation for perception, even for simple control setups. This implies, consistently with recent experimental results, that when studying sensory adaptation, it is essential to account for the task being performed.

MLNov 8, 2013

Visualizing the Effects of a Changing Distance on Data Using Continuous Embeddings

Gina Gruenhage, Manfred Opper, Simon Barthelme

Most Machine Learning (ML) methods, from clustering to classification, rely on a distance function to describe relationships between datapoints. For complex datasets it is hard to avoid making some arbitrary choices when defining a distance function. To compare images, one must choose a spatial scale, for signals, a temporal scale. The right scale is hard to pin down and it is preferable when results do not depend too tightly on the exact value one picked. Topological data analysis seeks to address this issue by focusing on the notion of neighbourhood instead of distance. It is shown that in some cases a simpler solution is available. It can be checked how strongly distance relationships depend on a hyperparameter using dimensionality reduction. A variant of dynamical multi-dimensional scaling (MDS) is formulated, which embeds datapoints as curves. The resulting algorithm is based on the Concave-Convex Procedure (CCCP) and provides a simple and efficient way of visualizing changes and invariances in distance patterns as a hyperparameter is varied. A variant to analyze the dependence on multiple hyperparameters is also presented. A cMDS algorithm that is straightforward to implement, use and extend is provided. To illustrate the possibilities of cMDS, cMDS is applied to several real-world data sets.

MLSep 12, 2013

Temporal Autoencoding Improves Generative Models of Time Series

Chris Häusler, Alex Susemihl, Martin P Nawrot et al.

Restricted Boltzmann Machines (RBMs) are generative models which can learn useful representations from samples of a dataset in an unsupervised fashion. They have been widely employed as an unsupervised pre-training method in machine learning. RBMs have been modified to model time series in two main ways: The Temporal RBM stacks a number of RBMs laterally and introduces temporal dependencies between the hidden layer units; The Conditional RBM, on the other hand, considers past samples of the dataset as a conditional bias and learns a representation which takes these into account. Here we propose a new training method for both the TRBM and the CRBM, which enforces the dynamic structure of temporal datasets. We do so by treating the temporal models as denoising autoencoders, considering past frames of the dataset as corrupted versions of the present frame and minimizing the reconstruction error of the present data by the model. We call this approach Temporal Autoencoding. This leads to a significant improvement in the performance of both models in a filling-in-frames task across a number of datasets. The error reduction for motion capture data is 56\% for the CRBM and 80\% for the TRBM. Taking the posterior mean prediction instead of single samples further improves the model's estimates, decreasing the error by as much as 91\% for the CRBM on motion capture data. We also trained the model to perform forecasting on a large number of datasets and have found TA pretraining to consistently improve the performance of the forecasts. Furthermore, by looking at the prediction error across time, we can see that this improvement reflects a better representation of the dynamics of the data as opposed to a bias towards reconstructing the observed data on a short time scale.

MLJan 12, 2013

Perturbative Corrections for Approximate Inference in Gaussian Latent Variable Models

Manfred Opper, Ulrich Paquet, Ole Winther

Expectation Propagation (EP) provides a framework for approximate inference. When the model under consideration is over a latent Gaussian field, with the approximation being Gaussian, we show how these approximations can systematically be corrected. A perturbative expansion is made of the exact but intractable correction, and can be applied to the model's partition function and other moments of interest. The correction is expressed over the higher-order cumulants which are neglected by EP's local matching of moments. Through the expansion, we see that EP is correct to first order. By considering higher orders, corrections of increasing polynomial complexity can be applied to the approximation. The second order provides a correction in quadratic time, which we apply to an array of Gaussian process and Ising models. The corrections generalize to arbitrarily complex approximating families, which we illustrate on tree-structured Ising model approximations. Furthermore, they provide a polynomial-time assessment of the approximation error. We also provide both theoretical and practical insights on the exactness of the EP solution.