PM · AI · CL · IR · Aug 23, 2025

THEME: Enhancing Thematic Investing with Semantic Stock Representations and Temporal Dynamics

arXiv:2508.16936v2 · 2 citations · h-index: 8 · CIKM
Originality: Incremental advance
AI Analysis

This work addresses thematic investing for financial analysts with a domain-specific method that incrementally improves on existing embedding techniques.

The paper develops a framework that fine-tunes embeddings with hierarchical contrastive learning to align themes with their constituent stocks. The authors report better thematic asset retrieval than leading large language models, along with compelling portfolio performance.

Thematic investing, which aims to construct portfolios aligned with structural trends, remains a challenging endeavor due to overlapping sector boundaries and evolving market dynamics. A promising direction is to build semantic representations of investment themes from textual data. However, despite their power, general-purpose LLM embedding models are not well-suited to capture the nuanced characteristics of financial assets, since the semantic representation of investment assets may differ fundamentally from that of general financial text. To address this, we introduce THEME, a framework that fine-tunes embeddings using hierarchical contrastive learning. THEME aligns themes and their constituent stocks using their hierarchical relationship, and subsequently refines these embeddings by incorporating stock returns. This process yields representations effective for retrieving thematically aligned assets with strong return potential. Empirical results demonstrate that THEME excels in two key areas. For thematic asset retrieval, it significantly outperforms leading large language models. Furthermore, its constructed portfolios demonstrate compelling performance. By jointly modeling thematic relationships from text and market dynamics from returns, THEME generates stock embeddings specifically tailored for a wide range of practical investment applications.
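To make the idea of contrastively aligning themes with their constituent stocks concrete, here is a minimal sketch in plain Python. It uses a generic, flat InfoNCE-style loss over toy vectors; it is not the paper's objective, which is hierarchical and is further refined with stock returns, neither of which is modeled here. The function names (`info_nce_loss`, `rank_stocks`), the temperature value, and the toy embeddings are all assumptions made for illustration.

```python
import math


def normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def info_nce_loss(theme_vecs, stock_vecs, temperature=0.1):
    """Mean InfoNCE loss (hypothetical helper, not from the paper).

    theme_vecs[i] is treated as the positive pair of stock_vecs[i];
    every other stock in the batch serves as a negative.
    """
    themes = [normalize(v) for v in theme_vecs]
    stocks = [normalize(v) for v in stock_vecs]
    total = 0.0
    for i, theme in enumerate(themes):
        logits = [dot(theme, s) / temperature for s in stocks]
        # Numerically stable log-sum-exp over all candidate stocks.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(themes)


def rank_stocks(theme_vec, stock_vecs):
    """Return stock indices sorted by cosine similarity to the theme, best first."""
    theme = normalize(theme_vec)
    scores = [dot(theme, normalize(s)) for s in stock_vecs]
    return sorted(range(len(stock_vecs)), key=lambda j: scores[j], reverse=True)


if __name__ == "__main__":
    # Toy 2-D embeddings: two themes and one constituent stock each (made up).
    themes = [[1.0, 0.0], [0.0, 1.0]]
    aligned = [[0.9, 0.1], [0.1, 0.9]]
    misaligned = [[0.1, 0.9], [0.9, 0.1]]
    print("loss, aligned pairs:   ", info_nce_loss(themes, aligned))
    print("loss, misaligned pairs:", info_nce_loss(themes, misaligned))
    print("ranking for theme 0:   ", rank_stocks(themes[0], aligned))
```

In this sketch, fine-tuning would mean adjusting the embeddings to lower the loss, which pulls each theme toward its own stocks and away from the others. Retrieval then reduces to ranking stocks by similarity to a theme vector.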

