AI · Nov 16, 2025

LOBERT: Generative AI Foundation Model for Limit Order Book Messages

arXiv:2511.12563v1
Originality: Incremental advance
AI Analysis

This work targets high-frequency financial modeling for traders and researchers, offering a model that is more adaptable and efficient than previous LOB models. The advance is incremental, since it builds on the existing BERT architecture.

The paper tackles message-level modeling of Limit Order Book (LOB) dynamics by introducing LOBERT, a foundation model that adapts BERT to LOB data with a novel tokenization scheme. It achieves leading performance on tasks such as predicting mid-price movements and next messages, while reducing the required context length.

Modeling the dynamics of financial Limit Order Books (LOB) at the message level is challenging due to irregular event timing, rapid regime shifts, and the reactions of high-frequency traders to visible order flow. Previous LOB models require cumbersome data representations and lack adaptability outside their original tasks, leading us to introduce LOBERT, a general-purpose encoder-only foundation model for LOB data suitable for downstream fine-tuning. LOBERT adapts the original BERT architecture for LOB data by using a novel tokenization scheme that treats complete multi-dimensional messages as single tokens while retaining continuous representations of price, volume, and time. With these methods, LOBERT achieves leading performance in tasks such as predicting mid-price movements and next messages, while reducing the required context length compared to previous methods.
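To make the message-as-token idea concrete, here is a minimal sketch of how one multi-dimensional message could be mapped to a single embedding vector while price, volume, and time stay continuous. It is not the paper's implementation: the event vocabulary, the embedding width, the log scaling, the relative-price input, and the names `embed_message` and `embed_sequence` are all assumptions made for illustration, and random weights stand in for learned parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical event vocabulary and embedding width (not from the paper).
EVENT_TYPES = {"add": 0, "cancel": 1, "execute": 2}
D_MODEL = 16

# Random parameters standing in for learned weights.
type_table = rng.normal(size=(len(EVENT_TYPES), D_MODEL))
cont_proj = rng.normal(size=(3, D_MODEL))  # projects (price, volume, dt)


def embed_message(event_type, price, volume, dt):
    """Map one LOB message to a single D_MODEL-dimensional token vector."""
    # Continuous fields are kept as real numbers (assumed log scaling for
    # volume and inter-arrival time) rather than being binned into a vocabulary.
    cont = np.array([price, np.log1p(volume), np.log1p(dt)])
    return type_table[EVENT_TYPES[event_type]] + cont @ cont_proj


def embed_sequence(messages):
    """Stack per-message embeddings: one row (one token) per message."""
    return np.stack([embed_message(**m) for m in messages])


# "price" is assumed here to be an offset relative to the mid-price.
messages = [
    {"event_type": "add", "price": 0.5, "volume": 100, "dt": 0.002},
    {"event_type": "execute", "price": 0.5, "volume": 40, "dt": 0.0001},
]
tokens = embed_sequence(messages)
print(tokens.shape)  # (2, 16)
```

Under these assumptions, each message occupies one sequence position. A scheme that spends a separate token on every field would need four positions for the same message, which is one plausible reading of how the context length shrinks.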

