4 Papers

57.4LGMay 12
KV-Fold: One-Step KV-Cache Recurrence for Long-Context Inference

Alireza Nadali, Patrick Cooper, Ashutosh Trivedi et al.

We introduce KV-Fold, a simple, training-free long-context inference protocol that treats the key-value (KV) cache as the accumulator in a left fold over sequence chunks. At each step, the model processes the next chunk conditioned on the accumulated cache, appends the newly produced keys and values, and passes the enlarged cache forward; the same one-step update is applied repeatedly, analogous to foldl in functional programming. Building on the KV cache concatenation primitive introduced for latent multi-agent communication, we repurpose it as a chunk-to-chunk recurrence for long-context inference. When processing chunk t, the model attends to the KV cache carried from earlier chunks as a prefix, reusing its internal state across segments without modifying or retraining the model. Despite its simplicity, the induced recurrence is stable: per-step drift rises briefly and then saturates into a flat plateau that persists across deep chains. This plateau is insensitive to a 10,000x change in numerical precision, robust across chunk sizes, and consistent across model families. At the task level, KV-Fold preserves exact information over long distances. On a needle-in-a-haystack benchmark, it achieves 100% exact-match retrieval across 152 trials spanning contexts from 16K to 128K tokens and chain depths up to 511 on Llama-3.1-8B, while remaining within the memory limits of a single 40GB GPU. Compared to streaming methods, which trade fidelity for bounded memory, KV-Fold maintains long-range retrieval while operating as a sequence of tractable forward passes. Overall, our results show that frozen pretrained transformers already support a stable form of KV-cache recurrence, providing a practical route to long-context inference without architectural changes or training.

80.7HCMar 12
To Believe or Not To Believe: Comparing Supporting Information Tools to Aid Human Judgments of AI Veracity

Jessica Irons, Patrick Cooper, Necva Bolucu et al.

With increasing awareness of the hallucination risks of generative artificial intelligence (AI), we see a growing shift toward providing information tooling to help users determine the veracity of AI-generated answers for themselves. User responsibility for assessing veracity is particularly critical for certain sectors that rely on on-demand, AI-generated data extraction, such as biomedical research and the legal sector. While prior work offers us a variety of ways in which systems can provide such support, there is a lack of empirical evidence on how this information is actually incorporated into the user's decision-making process. Our user study takes a step toward filling this knowledge gap. In the context of a generative AI data extraction tool, we examine the relationship between the type of supporting information (full source text, passage retrieval, and Large Language Model (LLM) explanations) and user behavior in the veracity assessment process, examined through the lens of efficiency, effectiveness, reliance and trust. We find that passage retrieval offers a reasonable compromise between accuracy and speed, with judgments of veracity comparable to using the full source text. LLM explanations, while also enabling rapid assessments, fostered inappropriate reliance and trust on the data extraction AI, such that participants were less likely to detect errors. In additiona, we analyzed the impacts of the complexity of the information need, finding preliminary evidence that inappropriate reliance is worse for complex answers. We demonstrate how, through rigorous user evaluation, we can better develop systems that allow for effective and responsible human agency in veracity assessment processes.

CLFeb 2
Monotonicity as an Architectural Bias for Robust Language Models

Patrick Cooper, Alireza Nadali, Ashutosh Trivedi et al.

Large language models (LLMs) are known to exhibit brittle behavior under adversarial prompts and jailbreak attacks, even after extensive alignment and fine-tuning. This fragility reflects a broader challenge of modern neural language models: small, carefully structured perturbations in high-dimensional input spaces can induce large and unpredictable changes in internal semantic representations and output. We investigate monotonicity as an architectural inductive bias for improving the robustness of Transformer-based language models. Monotonicity constrains semantic transformations so that strengthening information, evidence, or constraints cannot lead to regressions in the corresponding internal representations. Such order-preserving behavior has long been exploited in control and safety-critical systems to simplify reasoning and improve robustness, but has traditionally been viewed as incompatible with the expressivity required by neural language models. We show that this trade-off is not inherent. By enforcing monotonicity selectively in the feed-forward sublayers of sequence-to-sequence Transformers -- while leaving attention mechanisms unconstrained -- we obtain monotone language models that preserve the performance of their pretrained counterparts. This architectural separation allows negation, contradiction, and contextual interactions to be introduced explicitly through attention, while ensuring that subsequent semantic refinement is order-preserving. Empirically, monotonicity substantially improves robustness: adversarial attack success rates drop from approximately 69% to 19%, while standard summarization performance degrades only marginally.

LGFeb 2
Active Causal Experimentalist (ACE): Learning Intervention Strategies via Direct Preference Optimization

Patrick Cooper, Alvaro Velasquez

Discovering causal relationships requires controlled experiments, but experimentalists face a sequential decision problem: each intervention reveals information that should inform what to try next. Traditional approaches such as random sampling, greedy information maximization, and round-robin coverage treat each decision in isolation, unable to learn adaptive strategies from experience. We propose Active Causal Experimentalist (ACE), which learns experimental design as a sequential policy. Our key insight is that while absolute information gains diminish as knowledge accumulates (making value-based RL unstable), relative comparisons between candidate interventions remain meaningful throughout. ACE exploits this via Direct Preference Optimization, learning from pairwise intervention comparisons rather than non-stationary reward magnitudes. Across synthetic benchmarks, physics simulations, and economic data, ACE achieves 70-71% improvement over baselines at equal intervention budgets (p < 0.001, Cohen's d ~ 2). Notably, the learned policy autonomously discovers that collider mechanisms require concentrated interventions on parent variables, a theoretically-grounded strategy that emerges purely from experience. This suggests preference-based learning can recover principled experimental strategies, complementing theory with learned domain adaptation.