ST LG CP · Sep 25, 2024

MCI-GRU: Stock Prediction Model Based on Multi-Head Cross-Attention and Improved GRU

Peng Zhu, Yuante Li, Yifan Hu, Sheng Xiang, Qinyuan Liu, Dawei Cheng, Yuqi Liang

arXiv:2410.20679v35.919 citations · h-index: 10

Originality: Incremental advance

AI Analysis

This work addresses stock prediction for financial analysts and fund managers, but it appears incremental as it builds on existing GRU and attention mechanisms.

The paper tackled the problem of accurate stock prediction in complex financial markets by proposing MCI-GRU, a model that integrates multi-head cross-attention and an improved GRU, and it outperformed state-of-the-art techniques across multiple metrics in experiments on four stock markets.

As financial markets grow increasingly complex in the big data era, accurate stock prediction has become more critical. Traditional time series models, such as GRUs, have been widely used but often struggle to capture the intricate nonlinear dynamics of markets, particularly when it comes to flexibly selecting and effectively using key historical information. Recently, methods such as Graph Neural Networks and Reinforcement Learning have shown promise in stock prediction, but they require large amounts of high-quality data and tend to become unstable when the data are sparse or noisy. Moreover, the training and inference processes for these models are typically complex and computationally expensive, which limits their broad deployment in practical applications. Existing approaches also generally struggle to capture unobservable latent market states, such as market sentiment and expectations, microstructural factors, and participant behavior patterns; this leads to an inadequate understanding of market dynamics and, in turn, lowers prediction accuracy. To address these challenges, this paper proposes a stock prediction model, MCI-GRU, based on a multi-head cross-attention mechanism and an improved GRU. First, we enhance the GRU model by replacing the reset gate with an attention mechanism, thereby increasing the model's flexibility in selecting and utilizing historical information. Second, we design a multi-head cross-attention mechanism for learning unobservable latent market state representations, which are further enriched through interactions with both temporal features and cross-sectional features. Finally, extensive experiments on four major stock markets show that the proposed method outperforms state-of-the-art (SOTA) techniques across multiple metrics. Additionally, its successful application in real-world fund management operations confirms its effectiveness and practicality.
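The two architectural ideas in the abstract can be illustrated with a small, dependency-free Python sketch. This is not the authors' implementation: the abstract does not give the equations, so the sketch assumes that (a) the attention replacing the reset gate uses the current input as a query over all previous hidden states, and (b) the cross-attention uses a feature vector as the query and a set of latent state vectors as keys and values, with heads formed by slicing the vector and no learned projections. The names `AttentionGRUCell`, `run_sequence`, and `multi_head_cross_attention` are hypothetical, and the weights are random rather than trained.

```python
import math
import random


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys):
    """Scaled dot-product attention; the keys also serve as the values."""
    scale = math.sqrt(len(query))
    weights = softmax([sum(q * k for q, k in zip(query, key)) / scale for key in keys])
    return [sum(w * key[i] for w, key in zip(weights, keys)) for i in range(len(keys[0]))]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class AttentionGRUCell:
    """GRU-style cell whose reset gate is replaced by attention (assumed form)."""

    def __init__(self, input_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        self.W_z = rand_matrix(hidden_dim, input_dim, rng)   # update gate, input part
        self.U_z = rand_matrix(hidden_dim, hidden_dim, rng)  # update gate, state part
        self.W_q = rand_matrix(hidden_dim, input_dim, rng)   # input -> attention query
        self.W_h = rand_matrix(hidden_dim, input_dim, rng)   # candidate, input part
        self.U_h = rand_matrix(hidden_dim, hidden_dim, rng)  # candidate, context part

    def step(self, x, history):
        h_prev = history[-1]
        # Update gate, as in a standard GRU.
        z = [sigmoid(a + b) for a, b in zip(matvec(self.W_z, x), matvec(self.U_z, h_prev))]
        # Assumption: an attention context over past hidden states stands in for
        # the usual reset-gated state (r * h_prev).
        context = attend(matvec(self.W_q, x), history)
        cand = [math.tanh(a + b) for a, b in zip(matvec(self.W_h, x), matvec(self.U_h, context))]
        return [(1.0 - zi) * hp + zi * ci for zi, hp, ci in zip(z, h_prev, cand)]


def run_sequence(cell, xs):
    """Run the cell over a sequence, starting from a zero hidden state."""
    history = [[0.0] * cell.hidden_dim]
    for x in xs:
        history.append(cell.step(x, history))
    return history[1:]


def multi_head_cross_attention(query, latent_states, num_heads):
    """Attend from one feature vector to latent state vectors, one slice per head."""
    if len(query) % num_heads != 0:
        raise ValueError("vector length must be divisible by num_heads")
    head_dim = len(query) // num_heads
    out = []
    for h in range(num_heads):
        lo, hi = h * head_dim, (h + 1) * head_dim
        out.extend(attend(query[lo:hi], [s[lo:hi] for s in latent_states]))
    return out


# Usage: five days of three made-up features for one stock.
rng = random.Random(1)
cell = AttentionGRUCell(input_dim=3, hidden_dim=4)
xs = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(5)]
states = run_sequence(cell, xs)
latent = rand_matrix(6, 4, rng)  # six hypothetical latent market-state vectors
enriched = multi_head_cross_attention(states[-1], latent, num_heads=2)
```

In this sketch the attention context lets the candidate state draw on any earlier time step rather than only a gated copy of the previous state, which is one plausible reading of "flexibility in selecting and utilizing historical information". A trained model would learn the weight matrices and the latent state vectors, and would typically add per-head query, key, and value projections.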
