Dirichlet policies for reinforced factor portfolios
This is an incremental contribution aimed at finance researchers and practitioners. Its practical impact is limited, because the RL methods did not outperform simple benchmarks.
The paper combines factor investing with reinforcement learning by using Dirichlet policies for portfolio allocation. It finds that the resulting portfolios closely resemble equally-weighted allocations, which indicates that the agent learned to be factor-agnostic.
This article aims to combine factor investing and reinforcement learning (RL). The agent learns through sequential random allocations which rely on firms' characteristics. Using Dirichlet distributions as the driving policy, we derive closed forms for the policy gradients and analytical properties of the performance measure. This enables the implementation of REINFORCE methods, which we apply to a large dataset of US equities. Across a wide range of parametric choices, our results indicate that RL-based portfolios are very close to the equally-weighted (1/N) allocation. This implies that the agent learns to be *agnostic* with regard to factors, which can partly be explained by cross-sectional regressions showing a strong time variation in the relationship between returns and firm characteristics.
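To make the idea of a Dirichlet policy with a closed-form policy gradient concrete, here is a minimal sketch of one REINFORCE update. It is an illustration under stated assumptions, not the paper's exact specification. The exponential link `alpha_i = exp(x_i . theta)`, the one-period portfolio return used as the reward, and all function and variable names are hypothetical. The only facts relied on are the standard Dirichlet log-density and its derivative with respect to the concentration parameters, `psi(sum(alpha)) - psi(alpha_i) + log(w_i)`, where `psi` is the digamma function.

```python
import numpy as np
from scipy.special import digamma, gammaln


def concentrations(theta, X):
    # ASSUMPTION: exponential link alpha_i = exp(x_i . theta), chosen only
    # because it keeps every concentration strictly positive.
    return np.exp(X @ theta)


def log_dirichlet(theta, X, w):
    # Log-density of the weights w under Dirichlet(alpha(theta)).
    a = concentrations(theta, X)
    return gammaln(a.sum()) - gammaln(a).sum() + ((a - 1.0) * np.log(w)).sum()


def grad_log_dirichlet(theta, X, w):
    # Closed-form score: d log p / d alpha_i = psi(sum alpha) - psi(alpha_i) + log w_i,
    # then the chain rule through the assumed link, d alpha_i / d theta = alpha_i * x_i.
    a = concentrations(theta, X)
    dlogp_dalpha = digamma(a.sum()) - digamma(a) + np.log(w)
    return X.T @ (dlogp_dalpha * a)


def reinforce_step(theta, X, asset_returns, rng, lr=0.01):
    # Sample a random allocation, observe its reward, move theta along
    # reward * score (plain REINFORCE, no baseline).
    w = rng.dirichlet(concentrations(theta, X))
    reward = float(w @ asset_returns)  # ASSUMPTION: reward = one-period portfolio return
    theta_new = theta + lr * reward * grad_log_dirichlet(theta, X, w)
    return theta_new, w, reward


# Toy usage with simulated data (5 assets, 2 characteristics).
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 2))          # hypothetical firm characteristics
r = rng.normal(0.01, 0.05, size=5)       # hypothetical asset returns
theta, w, reward = reinforce_step(np.zeros(2), X, r, rng)
```

Under this assumed link, `theta = 0` makes all concentrations equal, so the expected allocation `alpha / alpha.sum()` is exactly 1/N. A policy whose learned parameters stay near zero is therefore one way to picture the factor-agnostic behaviour reported above.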