LG AIMar 18

CN-Buzz2Portfolio: A Chinese-Market Dataset and Benchmark for LLM-Based Macro and Sector Asset Allocation from Daily Trending Financial News

Liyuan Chen, Shilong Li, Jiangpeng Yan, Shuoling Liu, Qiang Yang, Xiu Li

arXiv:2603.2230565.91 citationsh-index: 3

AI Analysis

This provides a reproducible benchmark for financial agent research, addressing the gap between static stock-picking benchmarks and live trading, though it's incremental as it adapts existing LLM evaluation to a new financial domain.

The paper tackles the challenge of evaluating LLMs as autonomous financial agents by introducing CN-Buzz2Portfolio, a Chinese-market dataset and benchmark that maps daily trending news to macro and sector asset allocation, revealing significant disparities in how nine LLMs translate narratives into portfolio weights.

Large Language Models (LLMs) are rapidly transitioning from static Natural Language Processing (NLP) tasks including sentiment analysis and event extraction to acting as dynamic decision-making agents in complex financial environments. However, the evolution of LLMs into autonomous financial agents faces a significant dilemma in evaluation paradigms. Direct live trading is irreproducible and prone to outcome bias by confounding luck with skill, whereas existing static benchmarks are often confined to entity-level stock picking and ignore broader market attention. To facilitate the rigorous analysis of these challenges, we introduce CN-Buzz2Portfolio, a reproducible benchmark grounded in the Chinese market that maps daily trending news to macro and sector asset allocation. Spanning a rolling horizon from 2024 to mid-2025, our dataset simulates a realistic public attention stream, requiring agents to distill investment logic from high-exposure narratives instead of pre-filtered entity news. We propose a Tri-Stage CPA Agent Workflow involving Compression, Perception, and Allocation to evaluate LLMs on broad asset classes such as Exchange Traded Funds (ETFs) rather than individual stocks, thereby reducing idiosyncratic volatility. Extensive experiments on nine LLMs reveal significant disparities in how models translate macro-level narratives into portfolio weights. This work provides new insights into the alignment between general reasoning and financial decision-making, and all data, codes, and experiments are released to promote sustainable financial agent research.

View on arXiv PDF

Similar