LG · CE · TR · Sep 26, 2023

Gray-box Adversarial Attack of Deep Reinforcement Learning-based Trading Agents

arXiv:2309.14615v1 · 5 citations · h-index: 23
Originality: Incremental advance
AI Analysis

This paper addresses security vulnerabilities in automated stock trading systems, which matters for financial applications. The contribution is incremental: it adapts existing adversarial attack concepts to a specific domain.

The paper studies adversarial attacks on deep reinforcement learning-based trading agents and proposes a gray-box approach in which the adversary trades in the same market without direct access to the victim agent. Averaged over three simulated market configurations, the attack reduces reward values by 214.17% and potential profits by up to 139.4% (for the baseline agent).
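Reductions above 100% can look odd at first glance. The minimal sketch below shows one way to read them, assuming the reduction is measured relative to the magnitude of the un-attacked value; that definition and the function names `value_after_reduction` and `reduction_pct` are illustrative assumptions, not taken from the paper. Under this reading, a reduction above 100% means the quantity changes sign, so a profit becomes a loss.

```python
def value_after_reduction(baseline: float, reduction_percent: float) -> float:
    """Value left after cutting `baseline` by `reduction_percent` of its magnitude.

    Assumed definition (not from the paper): the reduction is relative to
    the un-attacked value, so anything above 100% flips the sign.
    """
    return baseline - abs(baseline) * reduction_percent / 100.0


def reduction_pct(baseline: float, attacked: float) -> float:
    """Inverse view: percentage reduction going from `baseline` to `attacked`."""
    if baseline == 0:
        raise ValueError("relative reduction is undefined for a zero baseline")
    return (baseline - attacked) / abs(baseline) * 100.0


# Hypothetical un-attacked profit of 100 units, combined with the
# reduction figures quoted in the summary and abstract.
for label, pct in [("baseline", 139.4), ("ensemble", 93.7), ("industrial", 85.5)]:
    remaining = value_after_reduction(100.0, pct)
    outcome = "loss" if remaining < 0 else "reduced profit"
    print(f"{label}: {remaining:.1f} ({outcome})")
```

With a hypothetical starting profit of 100 units, only the 139.4% figure pushes the result below zero; the 93.7% and 85.5% figures leave a small positive profit.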

In recent years, deep reinforcement learning (Deep RL) has been successfully implemented as a smart agent in many systems such as complex games, self-driving cars, and chatbots. One interesting use case of Deep RL is its application as an automated stock trading agent. In general, any automated trading agent is prone to manipulation by adversaries in the trading environment, so studying its robustness is vital for its success in practice. However, the typical mechanism for studying RL robustness, which is based on white-box gradient-based adversarial sample generation techniques (like FGSM), is not applicable to this use case, since the models are protected behind secure international exchange APIs, such as NASDAQ. In this research, we demonstrate that a "gray-box" approach for attacking a Deep RL-based trading agent is possible by trading in the same stock market, with no extra access to the trading agent. In our proposed approach, an adversary agent uses a hybrid deep neural network as its policy, consisting of convolutional layers and fully-connected layers. On average, over three simulated trading market configurations, the adversary policy proposed in this research is able to reduce the reward values by 214.17%, which results in reducing the potential profits of the baseline by 139.4%, of the ensemble method by 93.7%, and of an automated trading software developed by our industrial partner by 85.5%, while consuming significantly less budget than the victims (427.77%, 187.16%, and 66.97%, respectively).
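The abstract describes the adversary's policy only as a hybrid network of convolutional and fully-connected layers. As a rough illustration of that general shape, the sketch below runs a forward pass of a tiny 1D-convolution-plus-dense policy in plain Python. Everything specific here is an assumption for illustration and not the paper's architecture: the price-window input, the layer sizes, the random untrained weights, and the three-action (buy, hold, sell) output.

```python
import math
import random


def conv1d(window, kernel):
    """Valid (no padding) 1D convolution of a sequence with one kernel."""
    k = len(kernel)
    return [sum(window[i + j] * kernel[j] for j in range(k))
            for i in range(len(window) - k + 1)]


def relu(values):
    return [max(0.0, v) for v in values]


def dense(inputs, weights, bias):
    """Fully-connected layer: one weight row and one bias per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, bias)]


def softmax(logits):
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy(window, kernels, fc_weights, fc_bias):
    """Convolutional feature extractor followed by a dense action head."""
    features = []
    for kernel in kernels:
        features.extend(relu(conv1d(window, kernel)))
    return softmax(dense(features, fc_weights, fc_bias))


# Hypothetical sizes and random (untrained) weights, for illustration only.
rng = random.Random(0)
WINDOW, KERNEL_SIZE, N_KERNELS, N_ACTIONS = 10, 3, 2, 3
kernels = [[rng.uniform(-1, 1) for _ in range(KERNEL_SIZE)]
           for _ in range(N_KERNELS)]
n_features = N_KERNELS * (WINDOW - KERNEL_SIZE + 1)
fc_weights = [[rng.uniform(-1, 1) for _ in range(n_features)]
              for _ in range(N_ACTIONS)]
fc_bias = [0.0] * N_ACTIONS

# Assumed observation: recent prices expressed as change relative to the first.
prices = [100.0 + 0.5 * t for t in range(WINDOW)]
observation = [p / prices[0] - 1.0 for p in prices]

probs = policy(observation, kernels, fc_weights, fc_bias)
print(dict(zip(["buy", "hold", "sell"], probs)))
```

In a gray-box setting like the one the abstract describes, such a policy would see only market data that any participant can observe, which is why the sketch takes a price window as its sole input.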

