Can Blindfolded LLMs Still Trade? An Anonymization-First Framework for Portfolio Optimization
This work addresses trustworthiness in multi-agent trading systems by mitigating memorization and survivorship biases, though its contribution is incremental, focusing on validation within portfolio optimization.
The paper tackles the problem of ensuring that LLM trading agents rely on genuine market understanding rather than memorized ticker associations. It introduces an anonymization-first framework, BlindTrade, which achieves a Sharpe ratio of 1.40 +/- 0.22 on 2025 YTD data.
For LLM trading agents to be genuinely trustworthy, they must demonstrate understanding of market dynamics rather than exploitation of memorized ticker associations. Building responsible multi-agent systems demands rigorous signal validation: proving that predictions reflect legitimate patterns, not pre-trained recall. We address two sources of spurious performance: memorization bias from ticker-specific pre-training, and survivorship bias from flawed backtesting. Our approach is to blindfold the agents by anonymizing all identifiers and then verify whether meaningful signals persist. BlindTrade anonymizes tickers and company names, and four LLM agents output scores along with reasoning. We construct a GNN graph from the reasoning embeddings and trade using a PPO-DSR policy. On 2025 YTD data (through 2025-08-01), we achieve a Sharpe ratio of 1.40 +/- 0.22 across 20 seeds and validate signal legitimacy through negative control experiments. To assess robustness beyond a single OOS window, we additionally evaluate an extended period (2024-2025), which reveals a market-regime dependency: the policy excels in volatile conditions but shows reduced alpha in trending bull markets.
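
To make the blindfolding step concrete, the following is a minimal Python sketch of how tickers and company names might be replaced with opaque aliases before text reaches an LLM agent. It is an illustration under stated assumptions, not BlindTrade's actual implementation: the function names `build_alias_map` and `anonymize_text`, the alias format `ASSET_001`, the seeded shuffle, and the ticker-to-name lookup are all hypothetical choices made for this example.

```python
import random
import re


def build_alias_map(tickers, seed=0):
    """Map each ticker to an opaque alias such as 'ASSET_003' (hypothetical format).

    Shuffling with a seeded RNG keeps the mapping reproducible while
    avoiding aliases that leak the alphabetical order of the tickers.
    """
    shuffled = sorted(set(tickers))
    random.Random(seed).shuffle(shuffled)
    return {t: f"ASSET_{i:03d}" for i, t in enumerate(shuffled, start=1)}


def anonymize_text(text, alias_map, company_names):
    """Replace tickers and company names in `text` with their aliases.

    `company_names` maps ticker -> company name (an assumed input).
    Longer strings are tried first so that a short ticker cannot
    match part of a longer identifier.
    """
    replacements = {}
    for ticker, alias in alias_map.items():
        replacements[ticker] = alias
        name = company_names.get(ticker)
        if name:
            replacements[name] = alias
    if not replacements:
        return text
    keys = sorted(replacements, key=len, reverse=True)
    # Lookarounds act as word boundaries that also work for names ending in '.'
    pattern = re.compile(
        r"(?<!\w)(" + "|".join(re.escape(k) for k in keys) + r")(?!\w)"
    )
    return pattern.sub(lambda m: replacements[m.group(1)], text)


# Illustrative data only; not taken from the paper.
alias_map = build_alias_map(["AAPL", "MSFT", "NVDA"], seed=42)
names = {"AAPL": "Apple Inc.", "MSFT": "Microsoft", "NVDA": "NVIDIA"}
prompt = "Apple Inc. (AAPL) rose 2% while MSFT fell; NVIDIA was flat."
blind_prompt = anonymize_text(prompt, alias_map, names)
print(blind_prompt)
```

A ticker and its company name deliberately share one alias, so an agent still sees that both mentions refer to the same asset without learning which asset it is. A real pipeline would likely also need to handle case variants, product names, and other indirect identifiers, which this sketch does not attempt.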