CNN-DRL for Scalable Actions in Finance
The paper addresses a scalability issue in financial trading algorithms, though the contribution appears incremental: it adapts an existing CNN approach to a specific DRL bottleneck.
The paper tackles the problem of deep reinforcement learning agents struggling with large-scale actions in finance. It shows that MLP-based agents fail to adapt when trading volumes increase to thousands of shares, whereas the authors' CNN-based agent remains stable and increases rewards.
Published MLP-based DRL agents in finance have difficulty learning the dynamics of the environment as the action scale increases. If the buy and sell volume increases to one thousand shares, the MLP agent cannot adapt effectively to the environment. To address this, we designed a CNN agent that concatenates the daily feature vectors from the last ninety days to form the CNN input matrix. Our extensive experiments demonstrate that the MLP-based agent experiences a loss corresponding to the initial environment setup, while our CNN agent remains stable, learns the environment effectively, and increases rewards.
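The input construction described in the abstract can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `build_cnn_input`, the feature count of 8, the random placeholder data, and the (days × features) orientation of the matrix are all assumptions. Only the ninety-day window comes from the text.

```python
import numpy as np

WINDOW = 90  # look-back length in days, as stated in the abstract


def build_cnn_input(daily_features: np.ndarray, t: int, window: int = WINDOW) -> np.ndarray:
    """Stack the `window` most recent daily feature vectors, ending at day t (inclusive).

    daily_features: array of shape (num_days, num_features), one row per trading day.
    Returns an array of shape (window, num_features).
    The row-per-day layout is an assumption; the source does not specify it.
    """
    if t + 1 < window:
        raise ValueError(f"need at least {window} days of history, got {t + 1}")
    return daily_features[t - window + 1 : t + 1]


# Hypothetical data: 250 trading days, 8 features per day (placeholder values).
rng = np.random.default_rng(0)
history = rng.normal(size=(250, 8))

# Observation for the last day in the history.
state = build_cnn_input(history, t=249)
```

Here `state` has one row per day, oldest first, so a convolution along the first axis would see the features as a time series. Whether the paper convolves over time, over features, or both is not stated in the abstract.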