CE, LG · Feb 1

The Enhanced Physics-Informed Kolmogorov-Arnold Networks: Applications of Newton's Laws in Financial Deep Reinforcement Learning (RL) Algorithms

Trang Thoi, Hung Tran, Tram Thoi, Huaiyang Zhong

arXiv:2602.01388v11.2

Originality: Incremental advance

AI Analysis

This work addresses instability and poor generalization in financial deep reinforcement learning, offering a domain-specific improvement for portfolio optimization in dynamic markets.

The authors tackled portfolio optimization in financial markets by integrating Physics-Informed Kolmogorov-Arnold Networks (PIKANs) into deep reinforcement learning algorithms, resulting in higher cumulative returns, Sharpe ratios, and improved stability across equity markets in China, Vietnam, and the United States.
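For readers unfamiliar with the two headline metrics, the following is a minimal sketch of how cumulative return and an annualized Sharpe ratio are conventionally computed from a series of per-period simple returns. The function names, the zero default risk-free rate, and the 252-trading-day annualization factor are illustrative assumptions, not details taken from the paper.

```python
import math


def cumulative_return(returns):
    """Compound per-period simple returns into one total return."""
    wealth = 1.0
    for r in returns:
        wealth *= 1.0 + r
    return wealth - 1.0


def sharpe_ratio(returns, risk_free=0.0, periods_per_year=252):
    """Annualized Sharpe ratio based on the sample standard deviation.

    risk_free is a per-period rate; periods_per_year=252 assumes daily data.
    Both defaults are assumptions made for this sketch.
    """
    excess = [r - risk_free for r in returns]
    n = len(excess)
    mean = sum(excess) / n
    variance = sum((x - mean) ** 2 for x in excess) / (n - 1)
    return mean / math.sqrt(variance) * math.sqrt(periods_per_year)


if __name__ == "__main__":
    daily = [0.01, -0.005, 0.007, 0.002, -0.003]  # made-up sample returns
    print("cumulative return:", cumulative_return(daily))
    print("annualized Sharpe:", sharpe_ratio(daily))
```

A higher Sharpe ratio means more excess return per unit of volatility, which is why it is reported alongside raw cumulative return.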

Deep Reinforcement Learning (DRL), a subset of machine learning focused on sequential decision-making, has emerged as a powerful approach for tackling financial trading problems. In finance, DRL is commonly used either to generate discrete trade signals or to determine continuous portfolio allocations. In this work, we propose a novel reinforcement learning framework for portfolio optimization that incorporates Physics-Informed Kolmogorov-Arnold Networks (PIKANs) into several DRL algorithms. The approach replaces conventional multilayer perceptrons with Kolmogorov-Arnold Networks (KANs) in both the actor and critic components, using learnable B-spline univariate functions to achieve parameter-efficient and more interpretable function approximation. During actor updates, we introduce a physics-informed regularization loss that promotes second-order temporal consistency between observed return dynamics and the action-induced portfolio adjustments. The proposed framework is evaluated across three equity markets (China, Vietnam, and the United States), covering both emerging and developed economies. Across all three markets, PIKAN-based agents consistently deliver higher cumulative and annualized returns, superior Sharpe and Calmar ratios, and more favorable drawdown characteristics than both standard DRL baselines and classical online portfolio-selection methods. PIKAN-based agents also train more stably than their traditional DRL counterparts. The approach is particularly valuable in highly dynamic and noisy financial markets, where conventional DRL often suffers from instability and poor generalization.
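The abstract does not give the form of the physics-informed regularization loss, so the following is only a hypothetical sketch of what a second-order temporal consistency penalty could look like. It assumes the penalty compares the discrete second difference (an "acceleration", by analogy with Newton's second law) of asset returns with that of portfolio weights. The function names, the `coupling` constant, and the squared-residual form are all assumptions made for illustration and should not be read as the authors' formulation.

```python
def second_difference(series):
    """Central second difference x[t+1] - 2*x[t] + x[t-1], per asset.

    series is a list of time steps, each a list with one value per asset.
    """
    return [
        [nxt - 2.0 * cur + prv
         for prv, cur, nxt in zip(series[t - 1], series[t], series[t + 1])]
        for t in range(1, len(series) - 1)
    ]


def second_order_penalty(returns, weights, coupling=1.0):
    """Mean squared mismatch between weight and return 'accelerations'.

    coupling is a hypothetical constant playing the role of 1/mass;
    it is not a parameter named in the source.
    """
    accel_r = second_difference(returns)
    accel_w = second_difference(weights)
    residuals = [
        (aw - coupling * ar) ** 2
        for row_w, row_r in zip(accel_w, accel_r)
        for aw, ar in zip(row_w, row_r)
    ]
    return sum(residuals) / len(residuals)


if __name__ == "__main__":
    # Made-up data: three time steps, two assets.
    returns = [[0.00, 0.01], [0.01, 0.00], [0.04, -0.02]]
    weights = [[0.5, 0.5], [0.5, 0.5], [0.6, 0.4]]
    print("penalty:", second_order_penalty(returns, weights))
```

In an actor-critic setup, a term of this kind would typically be added to the actor loss with a small weight, so that it nudges the policy toward smooth, dynamics-aware reallocations without dominating the reward signal.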
