AIAug 15, 2023
Brain-inspired Computational Intelligence via Predictive CodingTommaso Salvatori, Ankur Mali, Christopher L. Buckley et al. · uw
Artificial intelligence (AI) is rapidly becoming one of the key technologies of this century. The majority of results in AI thus far have been achieved using deep neural networks trained with a learning algorithm called error backpropagation, always considered biologically implausible. To this end, recent works have studied learning algorithms for deep neural networks inspired by the neurosciences. One such theory, called predictive coding (PC), has shown promising properties that make it potentially valuable for the machine learning community: it can model information processing in different areas of the brain, can be used in control and robotics, has a solid mathematical foundation in variational inference, and performs its computations asynchronously. Inspired by such properties, works that propose novel PC-like algorithms are starting to be present in multiple sub-fields of machine learning and AI at large. Here, we survey such efforts by first providing a broad overview of the history of PC to provide common ground for the understanding of the recent developments, then by describing current efforts and results, and concluding with a large discussion of possible implications and ways forward.
MLMar 20, 2022
Geometric Methods for Sampling, Optimisation, Inference and Adaptive AgentsAlessandro Barp, Lancelot Da Costa, Guilherme França et al.
In this chapter, we identify fundamental geometric structures that underlie the problems of sampling, optimisation, inference and adaptive decision-making. Based on this identification, we derive algorithms that exploit these geometric structures to solve these problems efficiently. We show that a wide range of geometric theories emerge naturally in these fields, ranging from measure-preserving processes, information divergences, Poisson geometry, and geometric integration. Specifically, we explain how (i) leveraging the symplectic geometry of Hamiltonian systems enable us to construct (accelerated) sampling and optimisation methods, (ii) the theory of Hilbertian subspaces and Stein operators provides a general methodology to obtain robust estimators, (iii) preserving the information geometry of decision-making yields adaptive agents that perform active inference. Throughout, we emphasise the rich connections between these fields; e.g., inference draws on sampling and optimisation, and adaptive decision-making assesses decisions by inferring their counterfactual consequences. Our exposition provides a conceptual overview of underlying ideas, rather than a technical discussion, which can be found in the references herein.
ROAug 15, 2023
Hierarchical generative modelling for autonomous robotsKai Yuan, Noor Sajid, Karl Friston et al.
Humans can produce complex whole-body motions when interacting with their surroundings, by planning, executing and combining individual limb movements. We investigated this fundamental aspect of motor control in the setting of autonomous robotic operations. We approach this problem by hierarchical generative modelling equipped with multi-level planning-for autonomous task completion-that mimics the deep temporal architecture of human motor control. Here, temporal depth refers to the nested time scales at which successive levels of a forward or generative model unfold, for example, delivering an object requires a global plan to contextualise the fast coordination of multiple local movements of limbs. This separation of temporal scales also motivates robotics and control. Specifically, to achieve versatile sensorimotor control, it is advantageous to hierarchically structure the planning and low-level motor control of individual limbs. We use numerical and physical simulation to conduct experiments and to establish the efficacy of this formulation. Using a hierarchical generative model, we show how a humanoid robot can autonomously complete a complex task that necessitates a holistic use of locomotion, manipulation, and grasping. Specifically, we demonstrate the ability of a humanoid robot that can retrieve and transport a box, open and walk through a door to reach the destination, approach and kick a football, while showing robust performance in presence of body damage and ground irregularities. Our findings demonstrated the effectiveness of using human-inspired motor control algorithms, and our method provides a viable hierarchical architecture for the autonomous completion of challenging goal-directed tasks.
AIJun 6, 2023
Designing explainable artificial intelligence with active inference: A framework for transparent introspection and decision-makingMahault Albarracin, Inês Hipólito, Safae Essafi Tremblay et al.
This paper investigates the prospect of developing human-interpretable, explainable artificial intelligence (AI) systems based on active inference and the free energy principle. We first provide a brief overview of active inference, and in particular, of how it applies to the modeling of decision-making, introspection, as well as the generation of overt and covert actions. We then discuss how active inference can be leveraged to design explainable AI systems, namely, by allowing us to model core features of ``introspective'' processes and by generating useful, human-interpretable models of the processes involved in decision-making. We propose an architecture for explainable AI systems using active inference. This architecture foregrounds the role of an explicit hierarchical generative model, the operation of which enables the AI system to track and explain the factors that contribute to its own decisions, and whose structure is designed to be interpretable and auditable by human users. We outline how this architecture can integrate diverse sources of information to make informed decisions in an auditable manner, mimicking or reproducing aspects of human-like consciousness and introspection. Finally, we discuss the implications of our findings for future research in AI, and the potential ethical considerations of developing AI systems with (the appearance of) introspective capabilities.
CLOct 27, 2022
Natural Language Syntax Complies with the Free-Energy PrincipleElliot Murphy, Emma Holmes, Karl Friston
Natural language syntax yields an unbounded array of hierarchically structured expressions. We claim that these are used in the service of active inference in accord with the free-energy principle (FEP). While conceptual advances alongside modelling and simulation work have attempted to connect speech segmentation and linguistic communication with the FEP, we extend this program to the underlying computations responsible for generating syntactic objects. We argue that recently proposed principles of economy in language design - such as "minimal search" criteria from theoretical syntax - adhere to the FEP. This affords a greater degree of explanatory power to the FEP - with respect to higher language functions - and offers linguistics a grounding in first principles with respect to computability. We show how both tree-geometric depth and a Kolmogorov complexity estimate (recruiting a Lempel-Ziv compression algorithm) can be used to accurately predict legal operations on syntactic workspaces, directly in line with formulations of variational free energy minimization. This is used to motivate a general principle of language design that we term Turing-Chomsky Compression (TCC). We use TCC to align concerns of linguists with the normative account of self-organization furnished by the FEP, by marshalling evidence from theoretical linguistics and psycholinguistics to ground core principles of efficient syntactic computation within active inference.
LGJul 27, 2024
From pixels to planning: scale-free active inferenceKarl Friston, Conor Heins, Tim Verbelen et al.
This paper describes a discrete state-space model -- and accompanying methods -- for generative modelling. This model generalises partially observed Markov decision processes to include paths as latent variables, rendering it suitable for active inference and learning in a dynamic setting. Specifically, we consider deep or hierarchical forms using the renormalisation group. The ensuing renormalising generative models (RGM) can be regarded as discrete homologues of deep convolutional neural networks or continuous state-space models in generalised coordinates of motion. By construction, these scale-invariant models can be used to learn compositionality over space and time, furnishing models of paths or orbits; i.e., events of increasing temporal depth and itinerancy. This technical note illustrates the automatic discovery, learning and deployment of RGMs using a series of applications. We start with image classification and then consider the compression and generation of movies and music. Finally, we apply the same variational principles to the learning of Atari-like games.
LGMar 9, 2023
Curvature-Sensitive Predictive Coding with Approximate Laplace Monte CarloUmais Zahid, Qinghai Guo, Karl Friston et al.
Predictive coding (PC) accounts of perception now form one of the dominant computational theories of the brain, where they prescribe a general algorithm for inference and learning over hierarchical latent probabilistic models. Despite this, they have enjoyed little export to the broader field of machine learning, where comparative generative modelling techniques have flourished. In part, this has been due to the poor performance of models trained with PC when evaluated by both sample quality and marginal likelihood. By adopting the perspective of PC as a variational Bayes algorithm under the Laplace approximation, we identify the source of these deficits to lie in the exclusion of an associated Hessian term in the PC objective function, which would otherwise regularise the sharpness of the probability landscape and prevent over-certainty in the approximate posterior. To remedy this, we make three primary contributions: we begin by suggesting a simple Monte Carlo estimated evidence lower bound which relies on sampling from the Hessian-parameterised variational posterior. We then derive a novel block diagonal approximation to the full Hessian matrix that has lower memory requirements and favourable mathematical properties. Lastly, we present an algorithm that combines our method with standard PC to reduce memory complexity further. We evaluate models trained with our approach against the standard PC framework on image benchmark datasets. Our approach produces higher log-likelihoods and qualitatively better samples that more closely capture the diversity of the data-generating distribution.
NEJul 14, 2023
Brain in the Dark: Design Principles for Neuro-mimetic Learning and InferenceMehran H. Bazargani, Szymon Urbas, Karl Friston
Even though the brain operates in pure darkness, within the skull, it can infer the most likely causes of its sensory input. An approach to modelling this inference is to assume that the brain has a generative model of the world, which it can invert to infer the hidden causes behind its sensory stimuli, that is, perception. This assumption raises key questions: how to formulate the problem of designing brain-inspired generative models, how to invert them for the tasks of inference and learning, what is the appropriate loss function to be optimised, and, most importantly, what are the different choices of mean field approximation (MFA) and their implications for variational inference (VI).
AISep 30, 2024
Possible Principles for Aligned Structure Learning AgentsLancelot Da Costa, Tomáš Gavenčiak, David Hyland et al.
This paper offers a roadmap for the development of scalable aligned artificial intelligence (AI) from first principle descriptions of natural intelligence. In brief, a possible path toward scalable aligned AI rests upon enabling artificial agents to learn a good model of the world that includes a good model of our preferences. For this, the main objective is creating agents that learn to represent the world and other agents' world models; a problem that falls under structure learning (a.k.a. causal representation learning or model discovery). We expose the structure learning and alignment problems with this goal in mind, as well as principles to guide us forward, synthesizing various ideas across mathematics, statistics, and cognitive science. 1) We discuss the essential role of core knowledge, information geometry and model reduction in structure learning, and suggest core structural modules to learn a wide range of naturalistic worlds. 2) We outline a way toward aligned agents through structure learning and theory of mind. As an illustrative example, we mathematically sketch Asimov's Laws of Robotics, which prescribe agents to act cautiously to minimize the ill-being of other agents. We supplement this example by proposing refined approaches to alignment. These observations may guide the development of artificial intelligence in helping to scale existing -- or design new -- aligned structure learning systems.
AIJan 11, 2022Code
pymdp: A Python library for active inference in discrete state spacesConor Heins, Beren Millidge, Daphne Demekas et al.
Active inference is an account of cognition and behavior in complex systems which brings together action, perception, and learning under the theoretical mantle of Bayesian inference. Active inference has seen growing applications in academic research, especially in fields that seek to model human or animal behavior. While in recent years, some of the code arising from the active inference literature has been written in open source languages like Python and Julia, to-date, the most popular software for simulating active inference agents is the DEM toolbox of SPM, a MATLAB library originally developed for the statistical analysis and modelling of neuroimaging data. Increasing interest in active inference, manifested both in terms of sheer number as well as diversifying applications across scientific disciplines, has thus created a need for generic, widely-available, and user-friendly code for simulating active inference in open-source scientific computing languages like Python. The Python package we present here, pymdp (see https://github.com/infer-actively/pymdp), represents a significant step in this direction: namely, we provide the first open-source package for simulating active inference with partially-observable Markov Decision Processes or POMDPs. We review the package's structure and explain its advantages like modular design and customizability, while providing in-text code blocks along the way to demonstrate how it can be used to build and run active inference processes with ease. We developed pymdp to increase the accessibility and exposure of the active inference framework to researchers, engineers, and developers with diverse disciplinary backgrounds. In the spirit of open-source software, we also hope that it spurs new innovation, development, and collaboration in the growing active inference community.
MLMay 4
Online Generalised Predictive CodingMehran H. Z. Bazargani, Szymon Urbas, Adeel Razi et al.
This paper introduces an extension of generalised filtering for online applications. Generalised filtering refers to data assimilation schemes that jointly infer latent states, learn unknown model parameters, and estimate uncertainty in an integrated framework -- e.g., estimate state and observation noise -- at the same time (i.e., triple estimation). This framework appears across disciplines under different names, including variational Kalman-Bucy filtering in engineering, generalised predictive coding in neuroscience, and Dynamic Expectation Maximisation (DEM) in time-series analysis. Here, we specialise DEM for ``online'' data assimilation, through a separation of temporal scales. We describe the variational principles and procedures that allow one to assimilate data in a way that allows for a slow updating of parameters and precisions, which contextualise fast Bayesian belief updating about the dynamic hidden states. Using numerical studies, we demonstrate the validity of online DEM (ODEM) using a non-linear -- and potentially chaotic -- generative model, to show that the ODEM scheme can track the latent states of the generative process, even when its functional form differs fundamentally from the dynamics of the generative model. Framed from a neuro-mimetic predictive coding perspective, ODEM offers a biologically inspired solution to online inference, learning, and uncertainty estimation in dynamic environments.
AIApr 25
Active Inference: A method for Phenotyping Agency in AI systems?Philip Wilson, Axel Constant, Mahault Albarracin et al.
The proliferation of agentic artificial intelligence has outpaced the conceptual tools needed to characterize agency in computational systems. Prevailing definitions mainly rely on autonomy and goal-directedness. Here, we argue for a minimal notion open to principled inspection given three criteria: intentionality as action grounded in beliefs and desires, rationality as normatively coherent action entailed by a world model, and explainability as action causally traceable to internal states; we subsequently instantiate these as a partially observable Markov decision process under a variational framework wherein posterior beliefs, prior preferences, and the minimization of expected free energy jointly constitute an agentic action chain. Using a canonical T-maze paradigm, we evidence how empowerment, formulated as the channel capacity between actions and anticipated observations, serves as an operational metric that distinguishes zero-, intermediate-, and high-agency phenotypes through structural manipulations of the generative model. We conclude by arguing that as agents engage in epistemic foraging to resolve ambiguity, the governance controls that remain effective must shift systematically from external constraints to the internal modulation of prior preferences, offering a principled, variational bridge from computational phenotyping to AI governance strategy
NCMay 28, 2025
Self-orthogonalizing attractor neural networks emerging from the free energy principleTamas Spisak, Karl Friston
Attractor dynamics are a hallmark of many complex systems, including the brain. Understanding how such self-organizing dynamics emerge from first principles is crucial for advancing our understanding of neuronal computations and the design of artificial intelligence systems. Here we formalize how attractor networks emerge from the free energy principle applied to a universal partitioning of random dynamical systems. Our approach obviates the need for explicitly imposed learning and inference rules and identifies emergent, but efficient and biologically plausible inference and learning dynamics for such self-organizing systems. These result in a collective, multi-level Bayesian active inference process. Attractors on the free energy landscape encode prior beliefs; inference integrates sensory data into posterior beliefs; and learning fine-tunes couplings to minimize long-term surprise. Analytically and via simulations, we establish that the proposed networks favor approximately orthogonalized attractor representations, a consequence of simultaneously optimizing predictive accuracy and model complexity. These attractors efficiently span the input subspace, enhancing generalization and the mutual information between hidden causes and observable effects. Furthermore, while random data presentation leads to symmetric and sparse couplings, sequential data fosters asymmetric couplings and non-equilibrium steady-state dynamics, offering a natural extension to conventional Boltzmann Machines. Our findings offer a unifying theory of self-organizing attractor networks, providing novel insights for AI and neuroscience.
AIMay 30, 2025
AXIOM: Learning to Play Games in Minutes with Expanding Object-Centric ModelsConor Heins, Toon Van de Maele, Alexander Tschantz et al.
Current deep reinforcement learning (DRL) approaches achieve state-of-the-art performance in various domains, but struggle with data efficiency compared to human learning, which leverages core priors about objects and their interactions. Active inference offers a principled framework for integrating sensory information with prior knowledge to learn a world model and quantify the uncertainty of its own beliefs and predictions. However, active inference models are usually crafted for a single task with bespoke knowledge, so they lack the domain flexibility typical of DRL approaches. To bridge this gap, we propose a novel architecture that integrates a minimal yet expressive set of core priors about object-centric dynamics and interactions to accelerate learning in low-data regimes. The resulting approach, which we call AXIOM, combines the usual data efficiency and interpretability of Bayesian approaches with the across-task generalization usually associated with DRL. AXIOM represents scenes as compositions of objects, whose dynamics are modeled as piecewise linear trajectories that capture sparse object-object interactions. The structure of the generative model is expanded online by growing and learning mixture models from single events and periodically refined through Bayesian model reduction to induce generalization. AXIOM masters various games within only 10,000 interaction steps, with both a small number of parameters compared to DRL, and without the computational expense of gradient-based optimization.
NEMar 22, 2025
Meta-Representational Predictive Coding: Biomimetic Self-Supervised LearningAlexander Ororbia, Karl Friston, Rajesh P. N. Rao
Self-supervised learning has become an increasingly important paradigm in the domain of machine intelligence. Furthermore, evidence for self-supervised adaptation, such as contrastive formulations, has emerged in recent computational neuroscience and brain-inspired research. Nevertheless, current work on self-supervised learning relies on biologically implausible credit assignment -- in the form of backpropagation of errors -- and feedforward inference, typically a forward-locked pass. Predictive coding, in its mechanistic form, offers a biologically plausible means to sidestep these backprop-specific limitations. However, unsupervised predictive coding rests on learning a generative model of raw pixel input (akin to ``generative AI'' approaches), which entails predicting a potentially high dimensional input; on the other hand, supervised predictive coding, which learns a mapping between inputs to target labels, requires human annotation, and thus incurs the drawbacks of supervised learning. In this work, we present a scheme for self-supervised learning within a neurobiologically plausible framework that appeals to the free energy principle, constructing a new form of predictive coding that we call meta-representational predictive coding (MPC). MPC sidesteps the need for learning a generative model of sensory input (e.g., pixel-level features) by learning to predict representations of sensory input across parallel streams, resulting in an encoder-only learning and inference scheme. This formulation rests on active inference (in the form of sensory glimpsing) to drive the learning of representations, i.e., the representational dynamics are driven by sequences of decisions made by the model to sample informative portions of its sensorium.
AIMar 17, 2025
Robust Decision-Making Via Free Energy MinimizationAllahkaram Shafiei, Hozefa Jesawada, Karl Friston et al.
Despite their groundbreaking performance, state-of-the-art autonomous agents can misbehave when training and environmental conditions become inconsistent, with minor mismatches leading to undesirable behaviors or even catastrophic failures. Robustness towards these training/environment ambiguities is a core requirement for intelligent agents and its fulfillment is a long-standing challenge when deploying agents in the real world. Here, we introduce a Distributionally Robust Free Energy model (DR-FREE) that instills this core property by design. It directly wires robustness into the agent decision-making mechanisms via free energy minimization. By combining a robust extension of the free energy principle with a novel resolution engine, DR-FREE returns a policy that is optimal-yet-robust against ambiguity. The policy has an explicit, soft-max, structure that reveals the mechanistic role of ambiguity on optimal decisions and requisite Bayesian belief updating. We evaluate DR-FREE on an experimental testbed involving real rovers navigating an ambiguous environment filled with obstacles. Across all the experiments, DR-FREE enables robots to successfully navigate towards their goal even when, in contrast, state-of-the-art free energy models fail. In short, DR-FREE can tackle scenarios that elude previous methods: this milestone may inspire both deployment in multi-agent settings and, at a perhaps deeper level, the quest for a biologically plausible explanation of how natural agents -- with little or no training -- survive in capricious environments.
SYDec 14, 2025
EcoNet: Multiagent Planning and Control Of Household Energy Resources Using Active InferenceJohn C. Boik, Kobus Esterhuysen, Jacqueline B. Hynes et al.
Advances in automated systems afford new opportunities for intelligent management of energy at household, local area, and utility scales. Home Energy Management Systems (HEMS) can play a role by optimizing the schedule and use of household energy devices and resources. One challenge is that the goals of a household can be complex and conflicting. For example, a household might wish to reduce energy costs and grid-associated greenhouse gas emissions, yet keep room temperatures comfortable. Another challenge is that an intelligent HEMS agent must make decisions under uncertainty. An agent must plan actions into the future, but weather and solar generation forecasts, for example, provide inherently uncertain estimates of future conditions. This paper introduces EcoNet, a Bayesian approach to household and neighborhood energy management that is based on active inference. The aim is to improve energy management and coordination, while accommodating uncertainties and taking into account potentially conditional and conflicting goals and preferences. Simulation results are presented and discussed.
LGMar 29, 2024
Localising the Seizure Onset Zone from Single-Pulse Electrical Stimulation Responses with a CNN TransformerJamie Norris, Aswin Chari, Dorien van Blooijs et al.
Epilepsy is one of the most common neurological disorders, often requiring surgical intervention when medication fails to control seizures. For effective surgical outcomes, precise localisation of the epileptogenic focus - often approximated through the Seizure Onset Zone (SOZ) - is critical yet remains a challenge. Active probing through electrical stimulation is already standard clinical practice for identifying epileptogenic areas. Our study advances the application of deep learning for SOZ localisation using Single-Pulse Electrical Stimulation (SPES) responses, with two key contributions. Firstly, we implement an existing deep learning model to compare two SPES analysis paradigms: divergent and convergent. These paradigms evaluate outward and inward effective connections, respectively. We assess the generalisability of these models to unseen patients and electrode placements using held-out test sets. Our findings reveal a notable improvement in moving from a divergent (AUROC: 0.574) to a convergent approach (AUROC: 0.666), marking the first application of the latter in this context. Secondly, we demonstrate the efficacy of CNN Transformers with cross-channel attention in handling heterogeneous electrode placements, increasing the AUROC to 0.730. These findings represent a significant step in modelling patient-specific intracranial EEG electrode placements in SPES. Future work will explore integrating these models into clinical decision-making processes to bridge the gap between deep learning research and practical healthcare applications.
MLSep 21, 2021
Active inference, Bayesian optimal design, and expected utilityNoor Sajid, Lancelot Da Costa, Thomas Parr et al.
Active inference, a corollary of the free energy principle, is a formal way of describing the behavior of certain kinds of random dynamical systems that have the appearance of sentience. In this chapter, we describe how active inference combines Bayesian decision theory and optimal Bayesian design principles under a single imperative to minimize expected free energy. It is this aspect of active inference that allows for the natural emergence of information-seeking behavior. When removing prior outcomes preferences from expected free energy, active inference reduces to optimal Bayesian design, i.e., information gain maximization. Conversely, active inference reduces to Bayesian decision theory in the absence of ambiguity and relative risk, i.e., expected utility maximization. Using these limiting cases, we illustrate how behaviors differ when agents select actions that optimize expected utility, expected information gain, and expected free energy. Our T-maze simulations show optimizing expected free energy produces goal-directed information-seeking behavior while optimizing expected utility induces purely exploitive behavior and maximizing information gain engenders intrinsically motivated behavior.
NCJul 12, 2021
Bayesian brains and the Rényi divergenceNoor Sajid, Francesco Faccio, Lancelot Da Costa et al.
Under the Bayesian brain hypothesis, behavioural variations can be attributed to different priors over generative model parameters. This provides a formal explanation for why individuals exhibit inconsistent behavioural preferences when confronted with similar choices. For example, greedy preferences are a consequence of confident (or precise) beliefs over certain outcomes. Here, we offer an alternative account of behavioural variability using Rényi divergences and their associated variational bounds. Rényi bounds are analogous to the variational free energy (or evidence lower bound) and can be derived under the same assumptions. Importantly, these bounds provide a formal way to establish behavioural differences through an $α$ parameter, given fixed priors. This rests on changes in $α$ that alter the bound (on a continuous scale), inducing different posterior estimates and consequent variations in behaviour. Thus, it looks as if individuals have different priors, and have reached different conclusions. More specifically, $α\to 0^{+}$ optimisation leads to mass-covering variational estimates and increased variability in choice behaviour. Furthermore, $α\to + \infty$ optimisation leads to mass-seeking variational posteriors and greedy preferences. We exemplify this formulation through simulations of the multi-armed bandit task. We note that these $α$ parameterisations may be especially relevant, i.e., shape preferences, when the true posterior is not in the same family of distributions as the assumed (simpler) approximate density, which may be the case in many real-world scenarios. The ensuing departure from vanilla variational inference provides a potentially useful explanation for differences in behavioural preferences of biological (or artificial) agents under the assumption that the brain performs variational Bayesian inference.
AIJun 8, 2021
Exploration and preference satisfaction trade-off in reward-free learningNoor Sajid, Panagiotis Tigas, Alexey Zakharov et al.
Biological agents have meaningful interactions with their environment despite the absence of immediate reward signals. In such instances, the agent can learn preferred modes of behaviour that lead to predictable states -- necessary for survival. In this paper, we pursue the notion that this learnt behaviour can be a consequence of reward-free preference learning that ensures an appropriate trade-off between exploration and preference satisfaction. For this, we introduce a model-based Bayesian agent equipped with a preference learning mechanism (pepper) using conjugate priors. These conjugate priors are used to augment the expected free energy planner for learning preferences over states (or outcomes) across time. Importantly, our approach enables the agent to learn preferences that encourage adaptive behaviour at test time. We illustrate this in the OpenAI Gym FrozenLake and the 3D mini-world environments -- with and without volatility. Given a constant environment, these agents learn confident (i.e., precise) preferences and act to satisfy them. Conversely, in a volatile setting, perpetual preference uncertainty maintains exploratory behaviour. Our experiments suggest that learnable (reward-free) preferences entail a trade-off between exploration and preference satisfaction. Pepper offers a straightforward framework suitable for designing adaptive agents when reward functions cannot be predefined as in real environments.
AIMar 25, 2021
Active Inference Tree Search in Large POMDPsDomenico Maisto, Francesco Gregoretti, Karl Friston et al.
The ability to plan ahead efficiently is key for both living organisms and artificial systems. Model-based planning and prospection are widely studied in cognitive neuroscience and artificial intelligence (AI), but from different perspectives--and with different desiderata in mind (biological realism versus scalability) that are difficult to reconcile. Here, we introduce a novel method to plan in POMDPs--Active Inference Tree Search (AcT)--that combines the normative character and biological realism of a leading planning theory in neuroscience (Active Inference) and the scalability of tree search methods in AI. This unification enhances both approaches. On the one hand, tree searches enable the biologically grounded, first principle method of active inference to be applied to large-scale problems. On the other hand, active inference provides a principled solution to the exploration-exploitation dilemma, which is often addressed heuristically in tree search methods. Our simulations show that AcT successfully navigates binary trees that are challenging for sampling-based methods, problems that require adaptive exploration, and the large POMDP problem 'RockSample'--in which AcT reproduces state-of-the-art POMDP solutions. Furthermore, we illustrate how AcT can be used to simulate neurophysiological responses (e.g., in the hippocampus and prefrontal cortex) of humans and other animals that solve large planning problems. These numerical analyses show that Active Tree Search is a principled realisation of neuroscientific and AI planning theories, which offer both biological realism and scalability.
AISep 17, 2020
Reward Maximisation through Discrete Active InferenceLancelot Da Costa, Noor Sajid, Thomas Parr et al.
Active inference is a probabilistic framework for modelling the behaviour of biological and artificial agents, which derives from the principle of minimising free energy. In recent years, this framework has successfully been applied to a variety of situations where the goal was to maximise reward, offering comparable and sometimes superior performance to alternative approaches. In this paper, we clarify the connection between reward maximisation and active inference by demonstrating how and when active inference agents perform actions that are optimal for maximising reward. Precisely, we show the conditions under which active inference produces the optimal solution to the Bellman equation--a formulation that underlies several approaches to model-based reinforcement learning and control. On partially observed Markov decision processes, the standard active inference scheme can produce Bellman optimal actions for planning horizons of 1, but not beyond. In contrast, a recently developed recursive active inference scheme (sophisticated inference) can produce Bellman optimal actions on any finite temporal horizon. We append the analysis with a discussion of the broader relationship between active inference and reinforcement learning.
AISep 3, 2020
Action and Perception as Divergence MinimizationDanijar Hafner, Pedro A. Ortega, Jimmy Ba et al.
To learn directed behaviors in complex environments, intelligent agents need to optimize objective functions. Various objectives are known for designing artificial agents, including task rewards and intrinsic motivation. However, it is unclear how the known objectives relate to each other, which objectives remain yet to be discovered, and which objectives better describe the behavior of humans. We introduce the Action Perception Divergence (APD), an approach for categorizing the space of possible objective functions for embodied agents. We show a spectrum that reaches from narrow to general objectives. While the narrow objectives correspond to domain-specific rewards as typical in reinforcement learning, the general objectives maximize information with the environment through latent variable models of input sequences. Intuitively, these agents use perception to align their beliefs with the world and use actions to align the world with their beliefs. They infer representations that are informative of past inputs, explore future inputs that are informative of their representations, and select actions or skills that maximally influence future inputs. This explains a wide range of unsupervised objectives from a single principle, including representation learning, information gain, empowerment, and skill discovery. Our findings suggest leveraging powerful world models for unsupervised exploration as a path toward highly adaptive agents that seek out large niches in their environments, rendering task rewards optional.
NCJun 7, 2020
Deep active inference agents using Monte-Carlo methodsZafeirios Fountas, Noor Sajid, Pedro A. M. Mediano et al.
Active inference is a Bayesian framework for understanding biological intelligence. The underlying theory brings together perception and action under one single imperative: minimizing free energy. However, despite its theoretical utility in explaining intelligence, computational implementations have been restricted to low-dimensional and idealized situations. In this paper, we present a neural architecture for building deep active inference agents operating in complex, continuous state-spaces using multiple forms of Monte-Carlo (MC) sampling. For this, we introduce a number of techniques, novel to active inference. These include: i) selecting free-energy-optimal policies via MC tree search, ii) approximating this optimal policy distribution via a feed-forward `habitual' network, iii) predicting future parameter belief updates using MC dropouts and, finally, iv) optimizing state transition precision (a high-end form of attention). Our approach enables agents to learn environmental dynamics efficiently, while maintaining task performance, in relation to reward-based counterparts. We illustrate this in a new toy environment, based on the dSprites data-set, and demonstrate that active inference agents automatically create disentangled representations that are apt for modeling state transitions. In a more complex Animal-AI environment, our agents (using the same neural architecture) are able to simulate future state transitions and actions (i.e., plan), to evince reward-directed navigation - despite temporary suspension of visual input. These results show that deep active inference - equipped with MC methods - provides a flexible framework to develop biologically-inspired intelligent agents, with applications in both machine learning and cognitive science.
NCJun 7, 2020
Sophisticated InferenceKarl Friston, Lancelot Da Costa, Danijar Hafner et al.
Active inference offers a first principle account of sentient behaviour, from which special and important cases can be derived, e.g., reinforcement learning, active learning, Bayes optimal inference, Bayes optimal design, etc. Active inference resolves the exploitation-exploration dilemma in relation to prior preferences, by placing information gain on the same footing as reward or value. In brief, active inference replaces value functions with functionals of (Bayesian) beliefs, in the form of an expected (variational) free energy. In this paper, we consider a sophisticated kind of active inference, using a recursive form of expected free energy. Sophistication describes the degree to which an agent has beliefs about beliefs. We consider agents with beliefs about the counterfactual consequences of action for states of affairs and beliefs about those latent states. In other words, we move from simply considering beliefs about 'what would happen if I did that' to 'what would I believe about what would happen if I did that'. The recursive form of the free energy functional effectively implements a deep tree search over actions and outcomes in the future. Crucially, this search is over sequences of belief states, as opposed to states per se. We illustrate the competence of this scheme, using numerical simulations of deep decision problems.
MLMay 20, 2017
Bayesian Belief Updating of Spatiotemporal Seizure DynamicsGerald K Cooray, Richard Rosch, Torsten Baldeweg et al.
Epileptic seizure activity shows complicated dynamics in both space and time. To understand the evolution and propagation of seizures spatially extended sets of data need to be analysed. We have previously described an efficient filtering scheme using variational Laplace that can be used in the Dynamic Causal Modelling (DCM) framework [Friston, 2003] to estimate the temporal dynamics of seizures recorded using either invasive or non-invasive electrical recordings (EEG/ECoG). Spatiotemporal dynamics are modelled using a partial differential equation -- in contrast to the ordinary differential equation used in our previous work on temporal estimation of seizure dynamics [Cooray, 2016]. We provide the requisite theoretical background for the method and test the ensuing scheme on simulated seizure activity data and empirical invasive ECoG data. The method provides a framework to assimilate the spatial and temporal dynamics of seizure activity, an aspect of great physiological and clinical importance.
NCMay 17, 2017
Approximate Bayesian inference as a gauge theoryBiswa Sengupta, Karl Friston
In a published paper [Sengupta, 2016], we have proposed that the brain (and other self-organized biological and artificial systems) can be characterized via the mathematical apparatus of a gauge theory. The picture that emerges from this approach suggests that any biological system (from a neuron to an organism) can be cast as resolving uncertainty about its external milieu, either by changing its internal states or its relationship to the environment. Using formal arguments, we have shown that a gauge theory for neuronal dynamics -- based on approximate Bayesian inference -- has the potential to shed new light on phenomena that have thus far eluded a formal description, such as attention and the link between action and perception. Here, we describe the technical apparatus that enables such a variational inference on manifolds. Particularly, the novel contribution of this paper is an algorithm that utlizes a Schild's ladder for parallel transport of sufficient statistics (means, covariances, etc.) on a statistical manifold.