35.6NAMay 4
Hadamard Langevin dynamics for sampling the l1-priorIvan Cheltsov, Federico Cornalba, Clarice Poon et al.
Priors with non-smooth log-densities, such as the l1-prior, are widely used in Bayesian inverse problems for their sparsity-inducing properties. Existing Langevin-based sampling methods typically rely on proximal mappings or smooth approximations, which alter the target distribution. We propose an alternative approach based on a Hadamard product parameterization of the l1-norm, leading to a smooth but nonconvex and non-globally Lipschitz potential whose marginal law exactly recovers the desired posterior. The resulting Hadamard Langevin dynamics (HLD) defines a diffusion process that is analytically distinct from proximal or mirror-type Langevin schemes. Our main contribution is a rigorous well-posedness theory for both the continuous and discrete HLD. We establish existence and uniqueness of strong solutions, geometric ergodicity of the continuous dynamics, and convergence of the discretized scheme as the step size tends to zero. These results provide the first theoretical foundation for sampling from nonconvex, nonsmooth posteriors through overparameterized Langevin dynamics.
LGMar 9, 2022Code
Multi-Objective reward generalization: Improving performance of Deep Reinforcement Learning for applications in single-asset tradingFederico Cornalba, Constantin Disselkamp, Davide Scassola et al.
We investigate the potential of Multi-Objective, Deep Reinforcement Learning for stock and cryptocurrency single-asset trading: in particular, we consider a Multi-Objective algorithm which generalizes the reward functions and discount factor (i.e., these components are not specified a priori, but incorporated in the learning process). Firstly, using several important assets (cryptocurrency pairs BTCUSD, ETHUSDT, XRPUSDT, and stock indexes AAPL, SPY, NIFTY50), we verify the reward generalization property of the proposed Multi-Objective algorithm, and provide preliminary statistical evidence showing increased predictive stability over the corresponding Single-Objective strategy. Secondly, we show that the Multi-Objective algorithm has a clear edge over the corresponding Single-Objective strategy when the reward mechanism is sparse (i.e., when non-null feedback is infrequent over time). Finally, we discuss the generalization properties with respect to the discount factor. The entirety of our code is provided in open source format.