Discrete-Time Mean-Variance Strategy Based on Reinforcement Learning
This work addresses portfolio optimization for financial investors. Its contribution is incremental: it adapts an existing continuous-time model to discrete time under more general assumptions.
The paper develops a discrete-time mean-variance model for portfolio optimization using reinforcement learning. Simulation experiments and empirical analysis indicate that it applies better to real-world data than its continuous-time counterpart.
This paper studies a discrete-time mean-variance model based on reinforcement learning. Compared with its continuous-time counterpart in \cite{zhou2020mv}, the discrete-time model makes more general assumptions about the asset's return distribution. Using entropy to measure the cost of exploration, we derive the optimal investment strategy, whose density function is also of Gaussian type. Additionally, we design the corresponding reinforcement learning algorithm. Both simulation experiments and empirical analysis indicate that our discrete-time model is better suited to analyzing real-world data than the continuous-time model.
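As a rough illustration of what an entropy-regularized Gaussian strategy can look like in discrete time, the Python sketch below samples a risky-asset allocation from a Gaussian policy at each step and accumulates an entropy bonus for exploration. It is not the paper's algorithm. The linear form of the policy mean, the zero risk-free rate, and all names (`gain`, `target`, `temperature`, `simulate_episode`) are assumptions made for illustration only. The one standard fact used is the differential entropy of a one-dimensional Gaussian, $\tfrac{1}{2}\ln(2\pi e\,\sigma^2)$.

```python
import math
import random


def gaussian_entropy(std: float) -> float:
    """Differential entropy of a one-dimensional Gaussian with standard deviation std."""
    return 0.5 * math.log(2.0 * math.pi * math.e * std ** 2)


def sample_allocation(wealth, target, gain, std, rng):
    """Draw a dollar allocation to the risky asset from a Gaussian policy.

    Assumption: the policy mean is linear in (target - wealth). This form is
    hypothetical and chosen only to keep the sketch short.
    """
    mean = gain * (target - wealth)
    return rng.gauss(mean, std)


def simulate_episode(returns, x0, target, gain, std, temperature, seed=0):
    """Roll wealth forward over a sequence of per-period excess returns.

    Assumption: the risk-free rate is zero, so wealth changes only through
    the risky allocation. Returns terminal wealth and the accumulated
    entropy bonus (temperature times policy entropy, summed over steps).
    """
    rng = random.Random(seed)
    wealth = x0
    entropy_bonus = 0.0
    for r in returns:
        u = sample_allocation(wealth, target, gain, std, rng)
        wealth += u * r
        entropy_bonus += temperature * gaussian_entropy(std)
    return wealth, entropy_bonus


if __name__ == "__main__":
    # Hypothetical per-period excess returns, for demonstration only.
    demo_returns = [0.01, -0.02, 0.015, 0.005]
    terminal, bonus = simulate_episode(
        demo_returns, x0=1.0, target=1.2, gain=0.5, std=0.3, temperature=0.1
    )
    print(f"terminal wealth: {terminal:.4f}, entropy bonus: {bonus:.4f}")
```

A wider policy (larger `std`) has higher entropy, so it earns a larger exploration bonus at the price of noisier allocations. That trade-off is what an entropy-based exploration cost is meant to capture.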