MLApr 9, 2023
PriorCVAE: scalable MCMC parameter inference with Bayesian deep generative modellingElizaveta Semenova, Prakhar Verma, Max Cairney-Leeming et al. · oxford
Recent advances have shown that GP priors, or their finite realisations, can be encoded using deep generative models such as variational autoencoders (VAEs). These learned generators can serve as drop-in replacements for the original priors during MCMC inference. While this approach enables efficient inference, it loses information about the hyperparameters of the original models, and consequently makes inference over hyperparameters impossible and the learned priors indistinct. To overcome this limitation, we condition the VAE on stochastic process hyperparameters. This allows the joint encoding of hyperparameters with GP realizations and their subsequent estimation during inference. Further, we demonstrate that our proposed method, PriorCVAE, is agnostic to the nature of the models which it approximates, and can be used, for instance, to encode solutions of ODEs. It provides a practical tool for approximate inference and shows potential in real-life spatial and spatiotemporal applications.
LGJun 6, 2023
Memory-Based Dual Gaussian Processes for Sequential LearningPaul E. Chang, Prakhar Verma, S. T. John et al.
Sequential learning with Gaussian processes (GPs) is challenging when access to past data is limited, for example, in continual and active learning. In such cases, errors can accumulate over time due to inaccuracies in the posterior, hyperparameters, and inducing points, making accurate learning challenging. Here, we present a method to keep all such errors in check using the recently proposed dual sparse variational GP. Our method enables accurate inference for generic likelihoods and improves learning by actively building and updating a memory of past data. We demonstrate its effectiveness in several applications involving Bayesian optimization, active learning, and continual learning.
LGJun 3, 2023
Variational Gaussian Process Diffusion ProcessesPrakhar Verma, Vincent Adam, Arno Solin
Diffusion processes are a class of stochastic differential equations (SDEs) providing a rich family of expressive models that arise naturally in dynamic modelling tasks. Probabilistic inference and learning under generative models with latent processes endowed with a non-linear diffusion process prior are intractable problems. We build upon work within variational inference, approximating the posterior process as a linear diffusion process, and point out pathologies in the approach. We propose an alternative parameterization of the Gaussian variational process using a site-based exponential family description. This allows us to trade a slow inference algorithm with fixed-point iterations for a fast algorithm for convex optimization akin to natural gradient descent, which also provides a better objective for learning model parameters.
LGNov 2, 2022
Fantasizing with Dual GPs in Bayesian Optimization and Active LearningPaul E. Chang, Prakhar Verma, ST John et al.
Gaussian processes (GPs) are the main surrogate functions used for sequential modelling such as Bayesian Optimization and Active Learning. Their drawbacks are poor scaling with data and the need to run an optimization loop when using a non-Gaussian likelihood. In this paper, we focus on `fantasizing' batch acquisition functions that need the ability to condition on new fantasized data computationally efficiently. By using a sparse Dual GP parameterization, we gain linear scaling with batch size as well as one-step updates for non-Gaussian likelihoods, thus extending sparse models to greedy batch fantasizing acquisition functions.
CLOct 28, 2024
Plan*RAG: Efficient Test-Time Planning for Retrieval Augmented GenerationPrakhar Verma, Sukruta Prakash Midigeshi, Gaurav Sinha et al.
We introduce Plan*RAG, a novel framework that enables structured multi-hop reasoning in retrieval-augmented generation (RAG) through test-time reasoning plan generation. While existing approaches such as ReAct maintain reasoning chains within the language model's context window, we observe that this often leads to plan fragmentation and execution failures. Our key insight is that by isolating the reasoning plan as a directed acyclic graph (DAG) outside the LM's working memory, we can enable (1) systematic exploration of reasoning paths, (2) atomic subqueries enabling precise retrievals and grounding, and (3) efficiency through parallel execution and bounded context window utilization. Moreover, Plan*RAG's modular design allows it to be integrated with existing RAG methods, thus providing a practical solution to improve current RAG systems. On standard multi-hop reasoning benchmarks, Plan*RAG consistently achieves improvements over recently proposed methods such as RQ-RAG and Self-RAG, while maintaining comparable computational costs.
LGJun 19, 2025
Think Global, Act Local: Bayesian Causal Discovery with Language Models in Sequential DataPrakhar Verma, David Arbour, Sunav Choudhary et al.
Causal discovery from observational data typically assumes full access to data and availability of domain experts. In practice, data often arrive in batches, and expert knowledge is scarce. Language Models (LMs) offer a surrogate but come with their own issues-hallucinations, inconsistencies, and bias. We present BLANCE (Bayesian LM-Augmented Causal Estimation)-a hybrid Bayesian framework that bridges these gaps by adaptively integrating sequential batch data with LM-derived noisy, expert knowledge while accounting for both data-induced and LM-induced biases. Our proposed representation shift from Directed Acyclic Graph (DAG) to Partial Ancestral Graph (PAG) accommodates ambiguities within a coherent Bayesian framework, allowing grounding the global LM knowledge in local observational data. To guide LM interaction, we use a sequential optimization scheme that adaptively queries the most informative edges. Across varied datasets, BLANCE outperforms prior work in structural accuracy and extends to Bayesian parameter estimation, showing robustness to LM noise.
LGOct 29, 2021
Scalable Inference in SDEs by Direct Matching of the Fokker-Planck-Kolmogorov EquationArno Solin, Ella Tamir, Prakhar Verma
Simulation-based techniques such as variants of stochastic Runge-Kutta are the de facto approach for inference with stochastic differential equations (SDEs) in machine learning. These methods are general-purpose and used with parametric and non-parametric models, and neural SDEs. Stochastic Runge-Kutta relies on the use of sampling schemes that can be inefficient in high dimensions. We address this issue by revisiting the classical SDE literature and derive direct approximations to the (typically intractable) Fokker-Planck-Kolmogorov equation by matching moments. We show how this workflow is fast, scales to high-dimensional latent spaces, and is applicable to scarce-data applications, where a non-parametric SDE with a driving Gaussian process velocity field specifies the model.