LG · May 22, 2025

Reinforcement Learning for Stock Transactions

arXiv:2505.16099v2
Originality: Synthesis-oriented
AI Analysis

This work addresses stock market prediction for traders, but its contribution is incremental: it builds on existing RL techniques without a major breakthrough.

The paper tackles the problem of determining optimal stock transaction timing by applying reinforcement learning (RL) methods, including Q-Learning, Q-Learning with linear function approximation, and deep Q-Learning, to a custom Markov Decision Process (MDP) built on real-world data, and compares the resulting agents to identify the policy that best maximizes profit.

Much research has been done to analyze the stock market. After all, anyone who can find a pattern in the chaotic frenzy of transactions could make a hefty profit by capitalizing on it. The goal of our project was therefore to apply reinforcement learning (RL) to determine the best time to buy a stock within a given time frame. With only a few adjustments, our model can be extended to identify the best time to sell a stock as well. To train the model on free, real-world data in the format in which it is available, we define our own Markov Decision Process (MDP) problem. Two papers [5], [6] helped us formulate the state space and the reward system of our MDP. We train a series of agents using Q-Learning, Q-Learning with linear function approximation, and deep Q-Learning. In addition, we try to predict stock prices using machine learning regression and classification models. We then compare our agents to see whether they converge on a policy and, if so, which one learned the best policy for maximizing profit on the stock market.
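To make the "best time to buy within a given time frame" framing concrete, here is a minimal tabular Q-learning sketch on a toy MDP. It is an illustration under stated assumptions, not the paper's formulation: the state is only the time index within the window, the reward is the negative purchase price, a purchase is forced at the last step, and the price series is synthetic. The function names `train_buy_timing_agent` and `greedy_buy_index` are hypothetical; the paper's actual state space and reward system (informed by [5], [6]) are not described in this summary.

```python
import random

WAIT, BUY = 0, 1


def train_buy_timing_agent(prices, episodes=5000, alpha=0.1, gamma=1.0,
                           epsilon=0.1, seed=0):
    """Tabular Q-learning on a toy 'when to buy' MDP (illustrative assumptions).

    State:   time index t within the window.
    Actions: WAIT or BUY at prices[t]; BUY is forced at the final step.
    Reward:  -prices[t] when buying (cheaper is better), 0 when waiting.
    """
    rng = random.Random(seed)
    horizon = len(prices)
    q = [[0.0, 0.0] for _ in range(horizon)]

    for _ in range(episodes):
        t = 0
        while True:
            is_last = t == horizon - 1
            if is_last:
                action = BUY  # no time left, so the agent must buy
            elif rng.random() < epsilon:
                action = rng.randrange(2)  # explore
            else:
                action = BUY if q[t][BUY] > q[t][WAIT] else WAIT  # exploit

            if action == BUY:
                # Terminal transition: the target is just the immediate reward.
                q[t][BUY] += alpha * (-prices[t] - q[t][BUY])
                break

            # Waiting gives zero reward; bootstrap from the next state's value.
            if t + 1 == horizon - 1:
                next_value = q[t + 1][BUY]  # only BUY is available there
            else:
                next_value = max(q[t + 1])
            q[t][WAIT] += alpha * (gamma * next_value - q[t][WAIT])
            t += 1
    return q


def greedy_buy_index(q):
    """Return the time step at which the greedy policy buys."""
    for t in range(len(q) - 1):
        if q[t][BUY] > q[t][WAIT]:
            return t
    return len(q) - 1


if __name__ == "__main__":
    toy_prices = [5.0, 3.0, 4.0, 2.0, 6.0]  # synthetic, not market data
    q_table = train_buy_timing_agent(toy_prices)
    print("greedy policy buys at step", greedy_buy_index(q_table))
```

Because this toy price series is fixed, the agent effectively memorizes the cheapest step. A realistic version would need a state built from observable market features and prices that vary across episodes, which is where function approximation such as linear or deep Q-Learning becomes relevant.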

