NRR-Phi: Text-to-State Mapping for Ambiguity Preservation in LLM Inference
The paper addresses information loss in LLM inference on ambiguous language. The contribution is incremental, since it builds on the existing Non-Resolution Reasoning (NRR) framework.
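The paper tackles LLMs' premature commitment to a single interpretation of ambiguous input by proposing a text-to-state mapping framework that preserves multiple interpretations. On 68 ambiguous sentences, hybrid extraction reaches a mean state entropy of 1.087 bits, compared to 0 bits for collapse-based baselines. On 580 test cases, state-to-state operators that satisfy the design principles show 0% collapse, versus up to 17.8% for operators that violate them.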
Large language models exhibit a systematic tendency toward early semantic commitment: given ambiguous input, they collapse multiple valid interpretations into a single response before sufficient context is available. This premature collapse discards information that may prove essential as dialogue evolves. We present a formal framework for text-to-state mapping (phi: T -> S) that transforms natural language into a non-collapsing state space where multiple interpretations coexist. The mapping decomposes into three stages: conflict detection, interpretation extraction, and state construction. We instantiate phi with a hybrid extraction pipeline that combines rule-based segmentation for explicit conflict markers with LLM-based enumeration of implicit ambiguity. On a test set of 68 ambiguous sentences, the resulting states preserve interpretive multiplicity: hybrid extraction yields mean state entropy H = 1.087 bits across ambiguity categories, compared to H = 0 for collapse-based baselines that commit to a single interpretation. We also instantiate the rule-based conflict detector for Japanese markers to illustrate cross-lingual portability. This framework extends Non-Resolution Reasoning (NRR) by providing the algorithmic bridge between text and the NRR state space, enabling architectural collapse deferment in LLM inference. Design principles for state-to-state transformations are detailed in the Appendix, with empirical validation on 580 test cases demonstrating 0% collapse for principle-satisfying operators versus up to 17.8% for violating operators.
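The three-stage decomposition described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the marker list `CONFLICT_MARKERS`, the `State` class, the uniform default weights, and the reading of "state entropy" as Shannon entropy over normalized interpretation weights are all assumptions made for illustration. The LLM-based enumeration of implicit ambiguity is omitted, so only explicit conflict markers are handled here.

```python
import math
import re
from dataclasses import dataclass

# Hypothetical English conflict markers; the paper's actual list is not given here.
CONFLICT_MARKERS = ("but", "however", "although")
_MARKER_RE = re.compile(r"\b(?:" + "|".join(CONFLICT_MARKERS) + r")\b", re.IGNORECASE)


@dataclass(frozen=True)
class State:
    """Assumed state representation: interpretations with normalized weights."""
    interpretations: tuple
    weights: tuple


def detect_conflict(text: str) -> bool:
    """Stage 1: rule-based detection of an explicit conflict marker."""
    return _MARKER_RE.search(text) is not None


def extract_interpretations(text: str) -> list:
    """Stage 2: segment the text at explicit markers (rule-based path only)."""
    parts = (p.strip(" ,.;") for p in _MARKER_RE.split(text))
    return [p for p in parts if p]


def build_state(interpretations, weights=None) -> State:
    """Stage 3: keep every interpretation; uniform weights are an assumption."""
    if not interpretations:
        raise ValueError("at least one interpretation is required")
    if weights is None:
        weights = [1.0] * len(interpretations)
    total = sum(weights)
    return State(tuple(interpretations), tuple(w / total for w in weights))


def state_entropy(state: State) -> float:
    """Shannon entropy in bits over the weights (assumed definition of H)."""
    return -sum(w * math.log2(w) for w in state.weights if w > 0)


def phi(text: str) -> State:
    """Sketch of phi: T -> S as detection, extraction, then construction."""
    if detect_conflict(text):
        interpretations = extract_interpretations(text)
    else:
        # No explicit marker: this sketch falls back to a single reading.
        interpretations = [text.strip()]
    return build_state(interpretations)


if __name__ == "__main__":
    state = phi("I want to quit my job, but I need the money.")
    print(state.interpretations, state_entropy(state))
```

Under these assumptions, a state holding two equally weighted interpretations has an entropy of 1 bit, while a state with a single interpretation has an entropy of 0 bits, which matches the H = 0 reported for collapse-based baselines.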