Sean Lim

2papers

2 Papers

33.2LGMay 13
PALMS: A Computational Implementation for Pavlovian Associative Learning Models' Simulation

Martin Fixman, Alessandro Abati, Julián Jiménez Nimmo et al.

In contrast to static formalisms, computational definitions describe the operational mechanisms of a model. Simulations are an essential part of the cycle of theory development and refinement, assisting researchers in formulating the precise definitions that models require, and making accurate predictions. This manuscript introduces a computational implementation of Pavlovian learning models in a Python environment, termed Pavlovian Associative Learning Models' Simulation (PALMS). In addition to the canonical Rescorla-Wagner model, attentional approaches are implemented, including Pearce-Kaye-Hall, Mackintosh Extended, Le Pelley's Hybrid, and a novel extension of the Rescorla-Wagner model featuring a unified variable learning rate that synthesises Mackintosh's and Pearce and Hall's opposing conceptualisations. To our knowledge, only the first attentional model has been previously specified computationally in a general design tool. PALMS integrates a graphical interface that permits the input of entire experimental designs in an alphanumeric format, akin to that used by experimental neuroscientists. It uniquely enables the simulation of experiments involving hundreds of stimuli, such as those used with human participants, and the computation of configural cues and configural-cue compounds across all models, thereby substantially broadening their predictive capabilities. A comprehensive description of the models' implementation is provided in the paper. We evaluate PALMS by simulating five published experiments in the associative learning literature that assessed the predictive scope of existing models, and we show that this implementation provides neuroscientists with a useful tool for identifying critical variables, refining experimental designs, making precise predictions, comparing model fitness, and formulating new theoretical approaches.

51.4AIMar 17
Me, Myself, and $π$ : Evaluating and Explaining LLM Introspection

Atharv Naphade, Samarth Bhargav, Sean Lim et al.

A hallmark of human intelligence is Introspection-the ability to assess and reason about one's own cognitive processes. Introspection has emerged as a promising but contested capability in large language models (LLMs). However, current evaluations often fail to distinguish genuine meta-cognition from the mere application of general world knowledge or text-based self-simulation. In this work, we propose a principled taxonomy that formalizes introspection as the latent computation of specific operators over a model's policy and parameters. To isolate the components of generalized introspection, we present Introspect-Bench, a multifaceted evaluation suite designed for rigorous capability testing. Our results show that frontier models exhibit privileged access to their own policies, outperforming peer models in predicting their own behavior. Furthermore, we provide causal, mechanistic evidence explaining both how LLMs learn to introspect without explicit training, and how the mechanism of introspection emerges via attention diffusion.