Semantic State Abstraction Interfaces for LLM-Augmented Portfolio Decisions: Multi-Axis News Decomposition and RL Diagnostics
For researchers in sequential decision systems with sparse text inputs, SSAI provides a diagnostic protocol to separate representation from optimization effects, but the empirical results are incremental and do not demonstrate performance improvements over existing methods.
The paper introduces Semantic State Abstraction Interfaces (SSAI), a framework for mapping sparse news text into four interpretable axes (sentiment, risk, confidence, volatility forecast) for portfolio decisions. In the evaluation, the four-factor portfolio achieves 307.2% cumulative return and Sharpe 1.067 against 243.6% for buy-and-hold, but the gains fail coverage-stratified controls, reverse at transaction costs of 0.2% or more, and are statistically fragile relative to a sentiment-only baseline.
We introduce Semantic State Abstraction Interfaces (SSAI): a methodological template for mapping sparse unstructured text into $K$ auditable, named coordinates with neutral defaults on no-news days, designed to separate representation hypotheses from optimisation variance in sequential decision systems. Our contribution is the framework and its evaluation protocol, not a claim that SSAI outperforms denser alternatives. We instantiate SSAI with $K=4$ axes (sentiment, risk, confidence, volatility forecast) on a US-equity panel (30 NASDAQ-100 names, FNSPID news, 2019--2023 test period), and evaluate it across direct factor portfolios, supervised ridge forecasters, and RL agents (DP-PPO, SAC) that share the same fixed mapping $\phi$. The four-factor portfolio reaches 307.2% cumulative return and Sharpe 1.067, but its apparent gains versus buy-and-hold (243.6%) fail coverage-stratified controls, reverse at transaction costs of $\geq 0.2\%$, and are statistically fragile versus a sentiment-only baseline; a PC1 composite and a FinBERT portfolio baseline are stronger ranking signals in this setting. The ridge and RL blocks serve to diagnose representation effects versus optimiser effects. We position SSAI as an interpretability-performance diagnostic and a reusable protocol for sparse-text decision systems.
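As an illustration of the interface described above, the following is a minimal sketch of a mapping from per-day axis scores to a fixed-width state with neutral defaults on no-news days. It is not the authors' implementation: the function name `build_states`, the neutral value of 0.0 for every axis, the date-keyed input format, and the example scores are all assumptions made for illustration. The coverage mask is included because the abstract mentions coverage-stratified controls, which need to know which days carried news.

```python
from typing import Dict, List, Mapping, Sequence, Tuple

# Axis names follow the abstract (K = 4).
AXES: Tuple[str, ...] = ("sentiment", "risk", "confidence", "volatility_forecast")

# Assumption: 0.0 is the neutral default for every axis. The abstract says
# only that no-news days receive neutral defaults, not what the values are.
NEUTRAL: Dict[str, float] = {axis: 0.0 for axis in AXES}


def build_states(
    dates: Sequence[str],
    scores_by_date: Mapping[str, Mapping[str, float]],
) -> Tuple[List[List[float]], List[bool]]:
    """Map per-day axis scores to K-dimensional states.

    Returns (states, has_news): one row of len(AXES) floats per date, and a
    coverage mask that is True on days with at least one scored news item.
    """
    states: List[List[float]] = []
    has_news: List[bool] = []
    for d in dates:
        day = scores_by_date.get(d)
        has_news.append(day is not None)
        if day is None:
            # No-news day: every axis takes its neutral default.
            states.append([NEUTRAL[a] for a in AXES])
        else:
            # News day: use the supplied score, falling back per axis.
            states.append([float(day.get(a, NEUTRAL[a])) for a in AXES])
    return states, has_news


# Hypothetical two-day example: news on the first day only.
dates = ["2019-01-02", "2019-01-03"]
scores = {
    "2019-01-02": {
        "sentiment": 0.4,
        "risk": -0.1,
        "confidence": 0.8,
        "volatility_forecast": 0.2,
    }
}
states, has_news = build_states(dates, scores)
```

Because the mapping is a pure function of the text-derived scores, the same `states` array can in principle be passed unchanged to a factor portfolio, a ridge forecaster, or an RL agent, which is what lets differences between those blocks be attributed to the optimiser rather than to the representation.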