PM Jun 12, 2022
Deep Reinforcement Learning for Optimal Investment and Saving Strategy Selection in Heterogeneous Profiles: Intelligent Agents working towards retirement
Fatih Ozhamaratli, Paolo Barucca
The transition from defined benefit to defined contribution pension plans shifts the responsibility for saving toward retirement from governments and institutions to individuals. Determining an optimal saving and investment strategy for individuals is paramount for financial stability and for avoiding poverty during working life and retirement, and it is particularly challenging in a world where the forms of employment and the income trajectories experienced by different occupation groups are highly diverse. We introduce a model in which agents learn optimal portfolio allocation and saving strategies that are suitable for their heterogeneous profiles. We use deep reinforcement learning to train agents. The environment is calibrated with occupation- and age-dependent income evolution dynamics. The research focuses on heterogeneous income trajectories dependent on agent profiles and incorporates the behavioural parameterisation of agents. The model provides a flexible methodology to estimate lifetime consumption and investment choices for heterogeneous profiles under varying scenarios.
SI Apr 15
Maximum entropy temporal networks
Paolo Barucca
Temporal networks consist of timestamped directed interactions that may appear continuously in time, yet few studies have directly tackled the continuous-time modeling of networks. Here, we introduce a maximum-entropy approach to temporal networks. Under basic assumptions on the constraints, the corresponding network ensembles admit a modular and interpretable representation: a set of global time processes and a static maximum-entropy edge (e.g. node-pair) probability. This time-edge label factorization yields closed-form log-likelihoods and degree, clustering and motif expectations, as well as a whole class of effective generative models. We provide the maximum-entropy derivation for the non-homogeneous Poisson process (NHPP) intensities governing the probability of directed edges in temporal networks via functional optimization over path entropy, connecting NHPP modeling to maximum-entropy network ensembles. NHPPs consistently improve log-likelihood over generic Poisson processes, while the maximum-entropy edge labels recover strength constraints and reproduce expected unique-degree curves. We discuss the limitations of this framework and how it can be integrated with multivariate Hawkes calibration procedures, renewal theory, and neural kernel estimation in graph neural networks.
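The time-edge factorization described above can be illustrated with a small generative sketch. It is not the paper's estimator: it assumes a single global NHPP with a user-supplied intensity bounded by `lambda_max`, sampled by standard Lewis-Shedler thinning, and a fixed matrix `edge_probs` of node-pair probabilities. All function and parameter names here are hypothetical.

```python
import numpy as np

def sample_nhpp(intensity, t_max, lambda_max, rng):
    """Sample event times on [0, t_max] by thinning.

    Assumes 0 <= intensity(t) <= lambda_max for all t in [0, t_max].
    """
    times = []
    t = 0.0
    while True:
        # Candidate events come from a homogeneous process of rate lambda_max.
        t += rng.exponential(1.0 / lambda_max)
        if t > t_max:
            break
        # Keep each candidate with probability intensity(t) / lambda_max.
        if rng.uniform() < intensity(t) / lambda_max:
            times.append(t)
    return np.array(times)

def sample_temporal_edges(intensity, t_max, lambda_max, edge_probs, rng):
    """Draw timestamps from one global process, then label each with an edge.

    edge_probs[i, j] is the (assumed static) probability that an event is
    the directed edge i -> j; the matrix must sum to 1.
    """
    times = sample_nhpp(intensity, t_max, lambda_max, rng)
    flat = edge_probs.ravel()
    idx = rng.choice(flat.size, size=len(times), p=flat)
    src, dst = np.unravel_index(idx, edge_probs.shape)
    return times, src, dst
```

Because timestamps and edge labels are drawn independently in this sketch, the log-likelihood splits into a time term and an edge-label term, which is the kind of modularity the abstract refers to.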
ST Apr 11, 2022
Variational Heteroscedastic Volatility Model
Zexuan Yin, Paolo Barucca
We propose the Variational Heteroscedastic Volatility Model (VHVM), an end-to-end neural network architecture capable of modelling heteroscedastic behaviour in multivariate financial time series. VHVM leverages recent advances in several areas of deep learning, namely sequential modelling and representation learning, to model complex temporal dynamics between different asset returns. At its core, VHVM consists of a variational autoencoder to capture relationships between assets, and a recurrent neural network to model the time-evolution of these dependencies. The outputs of VHVM are time-varying conditional volatilities in the form of covariance matrices. We demonstrate the effectiveness of VHVM against existing methods such as Generalised AutoRegressive Conditional Heteroscedasticity (GARCH) and Stochastic Volatility (SV) models on a wide range of multivariate foreign currency (FX) datasets.
LG Apr 15
Physics-Informed Neural Networks for Solving Derivative-Constrained PDEs
Kentaro Hoshisashi, Carolyn E Phelan, Paolo Barucca
Physics-Informed Neural Networks (PINNs) recast PDE solving as an optimisation problem in function space by minimising a residual-based objective, yet many applications require additional derivative-based relations that are just as fundamental as the governing equations. In this paper, we present Derivative-Constrained PINNs (DC-PINNs), a general framework that treats constrained PDE solving as an optimisation guided by a minimum objective function criterion where the physics resides in the minimum principle. DC-PINNs embed general nonlinear constraints on states and derivatives (e.g., bounds, monotonicity, convexity, incompressibility), computed efficiently via automatic differentiation, and they employ self-adaptive loss balancing to tune the influence of each objective, reducing reliance on manual hyperparameters and problem-specific architectures. DC-PINNs consistently reduce constraint violations and improve physical fidelity versus baseline PINN variants and representative hard-constraint formulations on benchmarks including heat diffusion with bounds, financial volatilities with arbitrage-free constraints, and fluid flow with vortex shedding. Explicitly encoding derivative constraints stabilises training and steers optimisation toward physically admissible minima even when the PDE residual alone is small, providing reliable solutions of constrained PDEs grounded in energy minimum principles.
LG Dec 19, 2024
Granger Causality Detection with Kolmogorov-Arnold Networks
Hongyu Lin, Mohan Ren, Paolo Barucca et al.
Discovering causal relationships in time series data is central to many scientific areas, ranging from economics to climate science. Granger causality is a powerful tool for causality detection. However, its original formulation is limited by its linear form, and only recently have nonlinear machine-learning generalizations been introduced. This study contributes to the definition of neural Granger causality models by investigating the application of Kolmogorov-Arnold networks (KANs) in Granger causality detection and comparing their capabilities against multilayer perceptrons (MLPs). In this work, we develop a framework called Granger Causality KAN (GC-KAN) along with a tailored training approach designed specifically for Granger causality detection. We test this framework on both Vector Autoregressive (VAR) models and chaotic Lorenz-96 systems, analysing the ability of KANs to sparsify input features by identifying Granger causal relationships, providing a concise yet accurate model for Granger causality detection. Our findings show the potential of KANs to outperform MLPs in discerning interpretable Granger causal relationships, particularly in identifying sparse Granger causality patterns in high-dimensional settings, and more generally, the potential of AI in causality discovery for the dynamical laws in physical systems.
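For context, the classical linear formulation that the abstract contrasts with neural models can be sketched as a comparison of two least-squares autoregressions. This is a minimal illustration of the linear Granger test only, not GC-KAN; the function names and the F-statistic form with an intercept are choices made for this example.

```python
import numpy as np

def lagged_matrix(series, lags):
    """Rows are [s[t-1], ..., s[t-lags]] for t = lags, ..., T-1."""
    T = len(series)
    return np.column_stack([series[lags - k : T - k] for k in range(1, lags + 1)])

def granger_f_stat(x, y, lags=1):
    """F statistic for 'x Granger-causes y' in a linear autoregression."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    target = y[lags:]
    ones = np.ones((len(target), 1))
    # Restricted model: y explained by its own past only.
    restricted = np.hstack([ones, lagged_matrix(y, lags)])
    # Full model: the past of x is added as extra regressors.
    full = np.hstack([restricted, lagged_matrix(x, lags)])

    def rss(design):
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        return float(resid @ resid)

    rss_r, rss_f = rss(restricted), rss(full)
    dof = len(target) - full.shape[1]
    return ((rss_r - rss_f) / lags) / (rss_f / dof)
```

A large statistic means that the past of `x` reduces the prediction error of `y` beyond what the past of `y` already achieves. Neural approaches replace the linear regressions with nonlinear predictors while keeping this restricted-versus-full comparison in spirit.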
LG Feb 23, 2022
Deep Recurrent Modelling of Granger Causality with Latent Confounding
Zexuan Yin, Paolo Barucca
Inferring causal relationships in observational time series data is an important task when interventions cannot be performed. Granger causality is a popular framework to infer potential causal mechanisms between different time series. The original definition of Granger causality is restricted to linear processes and leads to spurious conclusions in the presence of a latent confounder. In this work, we harness the expressive power of recurrent neural networks and propose a deep learning-based approach to model non-linear Granger causality by directly accounting for latent confounders. Our approach leverages multiple recurrent neural networks to parameterise predictive distributions and we propose the novel use of a dual-decoder setup to conduct the Granger tests. We demonstrate the model performance on non-linear stochastic time series for which the latent confounder influences the cause and effect with different time lags; results show the effectiveness of our model compared to existing benchmarks.
LG Feb 23, 2022
Neural Generalised AutoRegressive Conditional Heteroskedasticity
Zexuan Yin, Paolo Barucca
We propose Neural GARCH, a class of methods to model conditional heteroskedasticity in financial time series. Neural GARCH is a neural network adaptation of the GARCH(1,1) model in the univariate case, and the diagonal BEKK(1,1) model in the multivariate case. We allow the coefficients of a GARCH model to be time-varying in order to reflect the constantly changing dynamics of financial markets. The time-varying coefficients are parameterised by a recurrent neural network that is trained with stochastic gradient variational Bayes. We propose two variants of our model, one with normal innovations and the other with Student's t innovations. We test our models on a wide range of univariate and multivariate financial time series, and we find that the Neural Student's t model consistently outperforms the others.
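As a point of reference, the classical univariate GARCH(1,1) recursion with constant coefficients can be sketched as follows. The neural variant described above would replace the constants `omega`, `alpha` and `beta` with values produced by a recurrent network at each step; that part is not shown. The function name and the initialisation at the unconditional variance are assumptions of this example.

```python
import numpy as np

def garch11_variance(returns, omega, alpha, beta):
    """Conditional variances of a constant-coefficient GARCH(1,1).

    sigma2[t] = omega + alpha * returns[t-1]**2 + beta * sigma2[t-1]
    """
    returns = np.asarray(returns, dtype=float)
    sigma2 = np.empty(len(returns))
    # Start at the unconditional variance, which requires alpha + beta < 1.
    sigma2[0] = omega / (1.0 - alpha - beta)
    for t in range(1, len(returns)):
        sigma2[t] = omega + alpha * returns[t - 1] ** 2 + beta * sigma2[t - 1]
    return sigma2
```

For example, `garch11_variance([0.0, 0.1, -0.2], omega=0.1, alpha=0.1, beta=0.8)` starts at the unconditional variance 1.0 and decays toward lower values because the observed squared returns are small.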
LG Jan 12, 2022
The Recurrent Reinforcement Learning Crypto Agent
Gabriel Borrageiro, Nick Firoozye, Paolo Barucca
We demonstrate a novel application of online transfer learning for a digital assets trading agent. This agent uses a powerful feature space representation in the form of an echo state network, the output of which is made available to a direct, recurrent reinforcement learning agent. The agent learns to trade the XBTUSD (Bitcoin versus US Dollars) perpetual swap derivatives contract on BitMEX on an intraday basis. By learning from the multiple sources of impact on the quadratic risk-adjusted utility that it seeks to maximise, the agent avoids excessive over-trading, captures a funding profit, and can predict the market's direction. Overall, our crypto agent realises a total return of 350%, net of transaction costs, over roughly five years, 71% of which is down to funding profit. The annualised information ratio that it achieves is 1.46.
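The abstract does not define how the annualised information ratio is computed. One common convention, assumed here purely for illustration, is the mean per-period return divided by its sample standard deviation, scaled by the square root of the number of periods per year, with a zero benchmark. The function name and the default of 252 periods are hypothetical.

```python
import numpy as np

def annualised_information_ratio(returns, periods_per_year=252):
    """Mean over standard deviation of per-period returns, annualised.

    Assumes a zero benchmark and uses the sample (ddof=1) standard deviation.
    """
    returns = np.asarray(returns, dtype=float)
    return returns.mean() / returns.std(ddof=1) * np.sqrt(periods_per_year)
```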
TR Oct 10, 2021
Reinforcement Learning for Systematic FX Trading
Gabriel Borrageiro, Nick Firoozye, Paolo Barucca
We explore online inductive transfer learning, with a feature representation transfer from a radial basis function network formed of Gaussian mixture model hidden processing units to a direct, recurrent reinforcement learning agent. This agent is put to work in an experiment, trading the major spot market currency pairs, where we accurately account for transaction and funding costs. These sources of profit and loss, including the price trends that occur in the currency markets, are made available via a quadratic utility to the agent, which learns to target a position directly. We improve upon earlier work by targeting a risk position in an online transfer learning context. Our agent achieves an annualised portfolio information ratio of 0.52 with a compound return of 9.3%, net of execution and funding cost, over a 7-year test set; this is despite forcing the model to trade at the close of the trading day at 5 pm EST when trading costs are statistically the most expensive.
LG Apr 26, 2021
Stochastic Recurrent Neural Network for Multistep Time Series Forecasting
Zexuan Yin, Paolo Barucca
Time series forecasting based on deep architectures has been gaining popularity in recent years due to their ability to model complex non-linear temporal dynamics. The recurrent neural network is one such model capable of handling variable-length input and output. In this paper, we leverage recent advances in deep generative models and the concept of state space models to propose a stochastic adaptation of the recurrent neural network for multistep-ahead time series forecasting, which is trained with stochastic gradient variational Bayes. In our model design, the transition function of the recurrent neural network, which determines the evolution of the hidden states, is stochastic rather than deterministic as in a regular recurrent neural network; this is achieved by incorporating a latent random variable into the transition process which captures the stochasticity of the temporal dynamics. Our model preserves the architectural workings of a recurrent neural network for which all relevant information is encapsulated in its hidden states, and this flexibility allows our model to be easily integrated into any deep architecture for sequential modelling. We test our model on a wide range of datasets from finance to healthcare; results show that the stochastic recurrent neural network consistently outperforms its deterministic counterpart.
CE Mar 15, 2021
Online Learning with Radial Basis Function Networks
Gabriel Borrageiro, Nick Firoozye, Paolo Barucca
Financial time series are characterised by their nonstationarity and autocorrelation. Even if these time series are differenced, technically ensuring their stationarity, they experience regular covariate shifts and concept drifts. Against this backdrop, we combine feature representation transfer with sequential optimisation to provide multi-horizon returns forecasts. Our online learning rbfnet outperforms a random-walk baseline and several powerful batch learners. The rbfnets we formulate are naturally designed to measure the similarity between test samples and continuously updated prototypes that capture the characteristics of the feature space.
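The idea of measuring similarity to continuously updated prototypes can be sketched with Gaussian radial basis features and a simple online update. This is an illustrative assumption, not the paper's procedure: the nearest-prototype update rule, the learning rate `lr`, the width parameter `gamma` and both function names are hypothetical.

```python
import numpy as np

def rbf_features(x, prototypes, gamma=1.0):
    """Gaussian similarity between sample x and each prototype (row)."""
    sq_dist = np.sum((prototypes - x) ** 2, axis=1)
    return np.exp(-gamma * sq_dist)

def update_nearest_prototype(x, prototypes, lr=0.1):
    """Return a copy of prototypes with the closest one moved toward x."""
    nearest = int(np.argmin(np.sum((prototypes - x) ** 2, axis=1)))
    updated = prototypes.astype(float)  # astype returns a new array
    updated[nearest] += lr * (x - updated[nearest])
    return updated
```

In an online setting, each new sample would first be mapped to `rbf_features` for forecasting and then used to refresh the prototypes, so the feature space tracks drift in the data.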
SI Dec 30, 2017
A dynamic network model with persistent links and node-specific latent variables, with an application to the interbank market
Piero Mazzarisi, Paolo Barucca, Fabrizio Lillo et al.
We propose a dynamic network model where two mechanisms control the probability of a link between two nodes: (i) the existence or absence of this link in the past, and (ii) node-specific latent variables (dynamic fitnesses) describing the propensity of each node to create links. Assuming Markov dynamics for both mechanisms, we propose an Expectation-Maximization algorithm for model estimation and inference of the latent variables. The estimated parameters and fitnesses can be used to forecast the presence of a link in the future. We apply our methodology to the e-MID interbank network, for which the two linkage mechanisms are associated with two different trading behaviors in the process of network formation, namely preferential trading and trading driven by node-specific characteristics. The empirical results allow us to recognise preferential lending in the interbank market and indicate how a method that does not account for time-varying network topologies tends to overestimate preferential linkage.
SI Jan 20, 2017
Disentangling group and link persistence in Dynamic Stochastic Block models
Paolo Barucca, Fabrizio Lillo, Piero Mazzarisi et al.
We study the inference of a model of dynamic networks in which both communities and links keep memory of previous network states. By considering maximum likelihood inference from single snapshot observations of the network, we show that link persistence makes the inference of communities harder, decreasing the detectability threshold, while community persistence tends to make it easier. We analytically show that communities inferred from a single network snapshot can share a maximum overlap with the underlying communities of a specific previous instant in time. This leads to time-lagged inference: the identification of past communities rather than present ones. Finally, we compute the time lag and propose a corrected algorithm, the Lagged Snapshot Dynamic (LSD) algorithm, for community detection in dynamic networks. We analytically and numerically characterize the detectability transitions of this algorithm as a function of the memory parameters of the model and compare it with a full dynamic inference.