LG | AI | Sep 11, 2025

Meta-Learning Reinforcement Learning for Crypto-Return Prediction

arXiv:2509.09751v1 | 1 citation | h-index: 2
Originality: Incremental advance
AI Analysis

This work addresses the problem of predicting volatile cryptocurrency returns for traders, but it appears incremental as it builds on existing LLM and RL techniques.

The paper tackles cryptocurrency return prediction by introducing Meta-RL-Crypto, a transformer-based architecture that combines meta-learning and reinforcement learning to create a self-improving trading agent. In experiments across diverse market regimes, it outperforms other LLM-based baselines.

Predicting cryptocurrency returns is notoriously difficult: price movements are driven by a fast-shifting blend of on-chain activity, news flow, and social sentiment, while labeled training data are scarce and expensive. In this paper, we present Meta-RL-Crypto, a transformer-based architecture that unifies meta-learning and reinforcement learning (RL) to create a fully self-improving trading agent. Starting from a vanilla instruction-tuned LLM, the agent alternates among three roles (actor, judge, and meta-judge) in a closed loop. The learning process requires no additional human supervision and can leverage multimodal market inputs together with internal preference feedback. The agent continuously refines both its trading policy and its evaluation criteria. Experiments across diverse market regimes show that Meta-RL-Crypto performs well on technical indicators from the real market and outperforms other LLM-based baselines.
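
One way to picture the actor, judge, and meta-judge loop described in the abstract is the minimal Python sketch below. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not specify how candidates are sampled, how scores are produced, or how the policy is updated, so `run_round`, `Preference`, and the callable signatures are hypothetical names chosen for this example. Here the three roles are plain functions (in the paper they are roles played by one LLM), and a round simply collects two kinds of preference pairs: one ranking the actor's forecasts by judge score, and one ranking the judge's own judgements by meta-judge rating.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple

Judgement = Tuple[str, float]  # (candidate forecast, judge score)


@dataclass
class Preference:
    """A chosen/rejected pair for one prompt (hypothetical container)."""
    prompt: str
    chosen: Any
    rejected: Any


def run_round(
    prompts: List[str],
    actor: Callable[[str], List[str]],
    judge: Callable[[str, str], float],
    meta_judge: Callable[[str, str, float], float],
) -> Tuple[List[Preference], List[Preference]]:
    """Collect preference pairs for one self-improvement round.

    Assumption: the policy update itself (for example a preference-based
    fine-tuning step) happens elsewhere and is not shown here.
    """
    actor_prefs: List[Preference] = []
    judge_prefs: List[Preference] = []
    for prompt in prompts:
        # Actor role: propose several candidate forecasts.
        candidates = actor(prompt)
        # Judge role: score every candidate.
        judgements: List[Judgement] = [(c, judge(prompt, c)) for c in candidates]

        # Actor feedback: highest-scored forecast vs. lowest-scored one.
        by_score = sorted(judgements, key=lambda j: j[1])
        actor_prefs.append(Preference(prompt, by_score[-1][0], by_score[0][0]))

        # Meta-judge role: rate each judgement, then prefer the best-rated
        # judgement over the worst-rated one as feedback for the judge.
        by_meta = sorted(judgements, key=lambda j: meta_judge(prompt, j[0], j[1]))
        judge_prefs.append(Preference(prompt, by_meta[-1], by_meta[0]))
    return actor_prefs, judge_prefs
```

With stub callables in place of the LLM, calling `run_round(["BTC next-day return?"], actor, judge, meta_judge)` returns two lists of `Preference` objects. Keeping the two lists separate mirrors the abstract's claim that both the trading policy and the evaluation criteria are refined, since each can then feed its own update step.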
