Hanqing Jin

LG · Dec 17, 2025

Adaptive Partitioning and Learning for Stochastic Control of Diffusion Processes

Hanqing Jin, Renyuan Xu, Yanzhao Yang

We study reinforcement learning for controlled diffusion processes with unbounded continuous state spaces, bounded continuous actions, and polynomially growing rewards: settings that arise naturally in finance, economics, and operations research. To overcome the challenges of continuous and high-dimensional domains, we introduce a model-based algorithm that adaptively partitions the joint state-action space. The algorithm maintains estimators of drift, volatility, and rewards within each partition, refining the discretization whenever estimation bias exceeds statistical confidence. This adaptive scheme balances exploration and approximation, enabling efficient learning in unbounded domains. Our analysis establishes regret bounds that depend on the problem horizon, state dimension, reward growth order, and a newly defined notion of zooming dimension tailored to unbounded diffusion processes. The bounds recover existing results for bounded settings as a special case, while extending theoretical guarantees to a broader class of diffusion-type problems. Finally, we validate the effectiveness of our approach through numerical experiments, including applications to high-dimensional problems such as multi-asset mean-variance portfolio selection.
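The refinement rule described in the abstract (split a partition once its estimation bias exceeds its statistical confidence) can be illustrated with a minimal one-dimensional sketch. Everything here is an assumption made for illustration and is not taken from the paper: the `Cell` class, the use of the cell diameter as a bias proxy, the `scale / sqrt(count)` confidence radius, and the bounded interval `[0, 1)`. The paper's algorithm works on an unbounded joint state-action space and estimates drift, volatility, and rewards separately.

```python
import math
from dataclasses import dataclass


@dataclass
class Cell:
    """One interval [lo, hi) of a hypothetical 1-D partition."""
    lo: float
    hi: float
    count: int = 0      # number of samples seen in this cell
    mean: float = 0.0   # running mean of the observed quantity

    @property
    def diameter(self) -> float:
        # Assumed bias proxy: a wider cell gives a coarser approximation.
        return self.hi - self.lo

    def confidence_radius(self, scale: float = 1.0) -> float:
        # Assumed statistical confidence: shrinks like 1 / sqrt(count).
        return math.inf if self.count == 0 else scale / math.sqrt(self.count)

    def update(self, value: float) -> None:
        # Incremental update of the running mean.
        self.count += 1
        self.mean += (value - self.mean) / self.count

    def should_split(self, scale: float = 1.0) -> bool:
        # Refine once the bias proxy exceeds the confidence radius.
        return self.diameter > self.confidence_radius(scale)

    def split(self) -> list["Cell"]:
        # Children inherit the parent's statistics as a starting point.
        mid = 0.5 * (self.lo + self.hi)
        return [
            Cell(self.lo, mid, self.count, self.mean),
            Cell(mid, self.hi, self.count, self.mean),
        ]


def observe(partition: list[Cell], x: float, value: float, scale: float = 1.0) -> None:
    """Record `value` at point `x` and refine the containing cell if needed."""
    for i, cell in enumerate(partition):
        if cell.lo <= x < cell.hi:
            cell.update(value)
            if cell.should_split(scale):
                partition[i:i + 1] = cell.split()
            return
    raise ValueError(f"x={x} lies outside the partition")


partition = [Cell(0.0, 1.0)]
observe(partition, 0.3, 1.0)   # one sample: radius 1.0, no split yet
observe(partition, 0.3, 3.0)   # two samples: radius below diameter, cell splits
```

Under this rule, frequently visited regions are refined while rarely visited regions stay coarse, which is the intuition behind the balance between exploration and approximation.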

PM · Sep 27, 2007

A Convex Stochastic Optimization Problem Arising from Portfolio Selection

Hanqing Jin, Zuo Quan Xu, Xun Yu Zhou

A continuous-time financial portfolio selection model with expected utility maximization typically boils down to solving a (static) convex stochastic optimization problem in terms of the terminal wealth, with a budget constraint. In the literature, the latter is solved by assuming a priori that the problem is well-posed (i.e., the supremum value is finite) and that a Lagrange multiplier exists (and, as a consequence, that the optimal solution is attainable). In this paper it is first shown, via various counterexamples, that neither of these two assumptions needs to hold, and that an optimal solution does not necessarily exist. These anomalies in turn have important interpretations in, and impacts on, portfolio selection modeling and solutions. Relations among the non-existence of the Lagrange multiplier, the ill-posedness of the problem, and the non-attainability of an optimal solution are then investigated. Finally, explicit and easily verifiable conditions are derived which lead to finding the unique optimal solution.
