Scott Viteri

CL
h-index11
3papers
16citations
Novelty68%
AI Score31

3 Papers

CLSep 21, 2024
Uncovering Latent Chain of Thought Vectors in Language Models

Jason Zhang, Scott Viteri

In this work, we examine how targeted perturbations in the activation space of Language Models (LMs) can encode complex reasoning patterns. We inject steering vectors, derived from LM activations, into LMs during inference time and study whether these vectors can induce Chain-of-Thought (CoT) reasoning in LMs without the need for natural language prompting. We demonstrate this approach on Llama3 8B Instruct and Mistral 7B v0.2 Instruct and show that activation-space interventions achieve competitive, if not superior, performance compared to traditional CoT prompting across multiple reasoning benchmarks, including GSM8k, MMLU, AGI Eval, and ARC AI2. These findings suggest that neural network activations can encode reasoning patterns, offering a new application of activation space manipulation as a tool for tuning model behavior.

CLApr 29, 2024
Markovian Transformers for Informative Language Modeling

Scott Viteri, Max Lamparth, Peter Chatain et al. · stanford

Chain-of-Thought (CoT) reasoning often fails to faithfully reflect a language model's underlying decision process. We address this by introducing a Markovian language model framework that can be understood as a reasoning autoencoder: it creates a text-based bottleneck where CoT serves as an intermediate representation, forcing the model to compress essential reasoning into interpretable text before making predictions. We train this system with a GRPO-style policy gradient algorithm using parallel sampling, a frozen baseline CoT', within-batch standardized advantages, and actor-reward (chain-rule) gradients. Our approach yields large gains on QA tasks (e.g., GSM8K: 20.7% to 54.5%; +33.8 pp; ARC-Challenge: 47.5% to 76.9%; +29.4 pp). Perturbation analyses across types and severities show consistently higher sensitivity to CoT edits (typically 52%--82% of cases favor Markovian), indicating stronger causal reliance on the CoT. Cross-model evaluation confirms that learned CoTs generalize across architectures, suggesting they capture transferable reasoning patterns rather than model-specific artifacts.

SCMar 31, 2020
Epistemic Phase Transitions in Mathematical Proofs

Scott Viteri, Simon DeDeo

Mathematical proofs are both paradigms of certainty and some of the most explicitly-justified arguments that we have in the cultural record. Their very explicitness, however, leads to a paradox, because the probability of error grows exponentially as the argument expands. When a mathematician encounters a proof, how does she come to believe it? Here we show that, under a cognitively-plausible belief formation mechanism combining deductive and abductive reasoning, belief in mathematical arguments can undergo what we call an epistemic phase transition: a dramatic and rapidly-propagating jump from uncertainty to near-complete confidence at reasonable levels of claim-to-claim error rates. To show this, we analyze an unusual dataset of forty-eight machine-aided proofs from the formalized reasoning system Coq, including major theorems ranging from ancient to 21st Century mathematics, along with five hand-constructed cases including Euclid, Apollonius, Hernstein's Topics in Algebra, and Andrew Wiles's proof of Fermat's Last Theorem. Our results bear both on recent work in the history and philosophy of mathematics on how we understand proofs, and on a question, basic to cognitive science, of how we justify complex beliefs.