LGSep 18, 2022
The Geometry of Self-supervised Learning Models and its Impact on Transfer LearningRomain Cosentino, Sarath Shekkizhar, Mahdi Soltanolkotabi et al.
Self-supervised learning (SSL) has emerged as a desirable paradigm in computer vision due to the inability of supervised models to learn representations that can generalize in domains with limited labels. The recent popularity of SSL has led to the development of several models that make use of diverse training strategies, architectures, and data augmentation policies with no existing unified framework to study or assess their effectiveness in transfer learning. We propose a data-driven geometric strategy to analyze different SSL models using local neighborhoods in the feature space induced by each. Unlike existing approaches that consider mathematical approximations of the parameters, individual components, or optimization landscape, our work aims to explore the geometric properties of the representation manifolds learned by SSL models. Our proposed manifold graph metrics (MGMs) provide insights into the geometric similarities and differences between available SSL models, their invariances with respect to specific augmentations, and their performances on transfer learning tasks. Our key findings are two fold: (i) contrary to popular belief, the geometry of SSL models is not tied to its training paradigm (contrastive, non-contrastive, and cluster-based); (ii) we can predict the transfer learning capability for a specific model based on the geometric properties of its semantic and augmentation manifolds.
LGJul 18, 2024
Out-of-Distribution Detection through Soft Clustering with Non-Negative Kernel RegressionAryan Gulati, Xingjian Dong, Carlos Hurtado et al. · allen-ai
As language models become more general purpose, increased attention needs to be paid to detecting out-of-distribution (OOD) instances, i.e., those not belonging to any of the distributions seen during training. Existing methods for detecting OOD data are computationally complex and storage-intensive. We propose a novel soft clustering approach for OOD detection based on non-negative kernel regression. Our approach greatly reduces computational and space complexities (up to 11x improvement in inference time and 87% reduction in storage requirements) and outperforms existing approaches by up to 4 AUROC points on four different benchmarks. We also introduce an entropy-constrained version of our algorithm, which leads to further reductions in storage requirements (up to 97% lower than comparable approaches) while retaining competitive performance. Our soft clustering approach for OOD detection highlights its potential for detecting tail-end phenomena in extreme-scale data settings.
LGOct 31, 2022
Study of Manifold Geometry using Multiscale Non-Negative Kernel GraphsCarlos Hurtado, Sarath Shekkizhar, Javier Ruiz-Hidalgo et al.
Modern machine learning systems are increasingly trained on large amounts of data embedded in high-dimensional spaces. Often this is done without analyzing the structure of the dataset. In this work, we propose a framework to study the geometric structure of the data. We make use of our recently introduced non-negative kernel (NNK) regression graphs to estimate the point density, intrinsic dimension, and the linearity of the data manifold (curvature). We further generalize the graph construction and geometric estimation to multiple scale by iteratively merging neighborhoods in the input data. Our experiments demonstrate the effectiveness of our proposed approach over other baselines in estimating the local geometry of the data manifolds on synthetic and real datasets.
AIJul 2, 2024
Reasoning in Large Language Models: A Geometric PerspectiveRomain Cosentino, Sarath Shekkizhar
The advancement of large language models (LLMs) for real-world applications hinges critically on enhancing their reasoning capabilities. In this work, we explore the reasoning abilities of large language models (LLMs) through their geometrical understanding. We establish a connection between the expressive power of LLMs and the density of their self-attention graphs. Our analysis demonstrates that the density of these graphs defines the intrinsic dimension of the inputs to the MLP blocks. We demonstrate through theoretical analysis and toy examples that a higher intrinsic dimension implies a greater expressive capacity of the LLM. We further provide empirical evidence linking this geometric framework to recent advancements in methods aimed at enhancing the reasoning capabilities of LLMs.
AIDec 4, 2023Code
Characterizing Large Language Model Geometry Helps Solve Toxicity Detection and GenerationRandall Balestriero, Romain Cosentino, Sarath Shekkizhar
Large Language Models (LLMs) drive current AI breakthroughs despite very little being known about their internal representations. In this work, we propose to shed the light on LLMs inner mechanisms through the lens of geometry. In particular, we develop in closed form $(i)$ the intrinsic dimension in which the Multi-Head Attention embeddings are constrained to exist and $(ii)$ the partition and per-region affine mappings of the feedforward (MLP) network of LLMs' layers. Our theoretical findings further enable the design of novel principled solutions applicable to state-of-the-art LLMs. First, we show that, through our geometric understanding, we can bypass LLMs' RLHF protection by controlling the embedding's intrinsic dimension through informed prompt manipulation. Second, we derive interpretable geometrical features that can be extracted from any (pre-trained) LLM, providing a rich abstract representation of their inputs. We observe that these features are sufficient to help solve toxicity detection, and even allow the identification of various types of toxicity. Our results demonstrate how, even in large-scale regimes, exact theoretical results can answer practical questions in LLMs. Code: https://github.com/RandallBalestriero/SplineLLM
AIMay 15
Counterparty Modeling is Not Strategy: The Limits of LLM NegotiatorsRomain Cosentino, Sarath Shekkizhar, Adam Earle et al.
Negotiation requires more than inferring what the other side wants: it requires using that information to make advantageous offers and counteroffers over multiple turns. We study whether large language model (LLM) agents do this in a controlled multi-attribute bargaining environment. We find that current LLM agents can model a counterparty's preferences, but do not reliably turn that knowledge into strategic bargaining. When given negotiating partner preference information, agents model it accurately and early in their reasoning traces, yet this does not reliably improve outcomes for the informed side. Turn-level analyses show why: agents often respond to what they believe the counterparty values, but do not consistently pair those moves with gains on their own high-value attributes. Sellers are more accommodating overall, and in asymmetric-information conditions, the informed side often makes the more weakly compensated concessions. Because agents fail to leverage this underlying utility structure for strategic advantage, their final agreements are heavily dictated by surface-level opening anchors rather than actual utility weights. Finally, requiring agents to explicitly state concession-for-reciprocity trades before making an offer makes individual turns look more strategic, but ultimately fails to improve the efficiency of the final agreements.
AIFeb 23
Interaction Theater: A case of LLM Agents Interacting at ScaleSarath Shekkizhar, Adam Earle
As multi-agent architectures and agent-to-agent protocols proliferate, a fundamental question arises: what actually happens when autonomous LLM agents interact at scale? We study this question empirically using data from Moltbook, an AI-agent-only social platform, with 800K posts, 3.5M comments, and 78K agent profiles. We combine lexical metrics (Jaccard specificity), embedding-based semantic similarity, and LLM-as-judge validation to characterize agent interaction quality. Our findings reveal agents produce diverse, well-formed text that creates the surface appearance of active discussion, but the substance is largely absent. Specifically, while most agents ($67.5\%$) vary their output across contexts, $65\%$ of comments share no distinguishing content vocabulary with the post they appear under, and information gain from additional comments decays rapidly. LLM judge based metrics classify the dominant comment types as spam ($28\%$) and off-topic content ($22\%$). Embedding-based semantic analysis confirms that lexically generic comments are also semantically generic. Agents rarely engage in threaded conversation ($5\%$ of comments), defaulting instead to independent top-level responses. We discuss implications for multi-agent interaction design, arguing that coordination mechanisms must be explicitly designed; without them, even large populations of capable agents produce parallel output rather than productive exchange.
AINov 12, 2025
Echoing: Identity Failures when LLM Agents Talk to Each OtherSarath Shekkizhar, Romain Cosentino, Adam Earle et al.
As large language model (LLM) based agents interact autonomously with one another, a new class of failures emerges that cannot be predicted from single agent performance: behavioral drifts in agent-agent conversations (AxA). Unlike human-agent interactions, where humans ground and steer conversations, AxA lacks such stabilizing signals, making these failures unique. We investigate one such failure, echoing, where agents abandon their assigned roles and instead mirror their conversational partners, undermining their intended objectives. Through experiments across $60$ AxA configurations, $3$ domains, and $2000+$ conversations, we demonstrate that echoing occurs across three major LLM providers, with echoing rates from $5\%$ to $70\%$ depending on the model and domain. Moreover, we find that echoing is persistent even in advanced reasoning models with substantial rates ($32.8\%$) that are not reduced by increased reasoning efforts. We analyze prompt impacts, conversation dynamics, showing that echoing arises as interaction grows longer ($7+$ turns in experiments) and is not merely an artifact of sub-optimal prompting. Finally, we introduce a protocol-level mitigation in which targeted use of structured responses reduces echoing to $9\%$.
MANov 11, 2025
Convergence dynamics of Agent-to-Agent Interactions with Misaligned objectivesRomain Cosentino, Sarath Shekkizhar, Adam Earle
We develop a theoretical framework for agent-to-agent interactions in multi-agent scenarios. We consider the setup in which two language model based agents perform iterative gradient updates toward their respective objectives in-context, using the output of the other agent as input. We characterize the generation dynamics associated with the interaction when the agents have misaligned objectives, and show that this results in a biased equilibrium where neither agent reaches its target - with the residual errors predictable from the objective gap and the geometry induced by the prompt of each agent. We establish the conditions for asymmetric convergence and provide an algorithm that provably achieves an adversarial result, producing one-sided success. Experiments with trained transformer models as well as GPT$5$ for the task of in-context linear regression validate the theory. Our framework presents a setup to study, predict, and defend multi-agent systems; explicitly linking prompt design and interaction setup to stability, bias, and robustness.
AIApr 2
Beyond the Assistant Turn: User Turn Generation as a Probe of Interaction Awareness in Language ModelsSarath Shekkizhar, Romain Cosentino, Adam Earle
Standard LLM benchmarks evaluate the assistant turn: the model generates a response to an input, a verifier scores correctness, and the analysis ends. This paradigm leaves unmeasured whether the LLM encodes any awareness of what follows the assistant response. We propose user-turn generation as a probe of this gap: given a conversation context of user query and assistant response, we let a model generate under the user role. If the model's weights encode interaction awareness, the generated user turn will be a grounded follow-up that reacts to the preceding context. Through experiments across $11$ open-weight LLMs (Qwen3.5, gpt-oss, GLM) and $5$ datasets (math reasoning, instruction following, conversation), we show that interaction awareness is decoupled from task accuracy. In particular, within the Qwen3.5 family, GSM8K accuracy scales from $41\%$ ($0.8$B) to $96.8\%$ ($397$B-A$17$B), yet genuine follow-up rates under deterministic generation remain near zero. In contrast, higher temperature sampling reveals interaction awareness is latent with follow up rates reaching $22\%$. Controlled perturbations validate that the proposed probe measures a real property of the model, and collaboration-oriented post-training on Qwen3.5-2B demonstrates an increase in follow-up rates. Our results show that user-turn generation captures a dimension of LLM behavior, interaction awareness, that is unexplored and invisible with current assistant-only benchmarks.
AIApr 21, 2025
AGI Is Coming... Right After AI Learns to Play WordleSarath Shekkizhar, Romain Cosentino
This paper investigates multimodal agents, in particular, OpenAI's Computer-User Agent (CUA), trained to control and complete tasks through a standard computer interface, similar to humans. We evaluated the agent's performance on the New York Times Wordle game to elicit model behaviors and identify shortcomings. Our findings revealed a significant discrepancy in the model's ability to recognize colors correctly depending on the context. The model had a $5.36\%$ success rate over several hundred runs across a week of Wordle. Despite the immense enthusiasm surrounding AI agents and their potential to usher in Artificial General Intelligence (AGI), our findings reinforce the fact that even simple tasks present substantial challenges for today's frontier AI models. We conclude with a discussion of the potential underlying causes, implications for future development, and research directions to improve these AI systems.
LGOct 18, 2021
Channel redundancy and overlap in convolutional neural networks with channel-wise NNK graphsDavid Bonet, Antonio Ortega, Javier Ruiz-Hidalgo et al.
Feature spaces in the deep layers of convolutional neural networks (CNNs) are often very high-dimensional and difficult to interpret. However, convolutional layers consist of multiple channels that are activated by different types of inputs, which suggests that more insights may be gained by studying the channels and how they relate to each other. In this paper, we first analyze theoretically channel-wise non-negative kernel (CW-NNK) regression graphs, which allow us to quantify the overlap between channels and, indirectly, the intrinsic dimension of the data representation manifold. We find that redundancy between channels is significant and varies with the layer depth and the level of regularization during training. Additionally, we observe that there is a correlation between channel overlap in the last convolutional layer and generalization performance. Our experimental results demonstrate that these techniques can lead to a better understanding of deep representations.
LGOct 15, 2021
NNK-Means: Data summarization using dictionary learning with non-negative kernel regressionSarath Shekkizhar, Antonio Ortega
An increasing number of systems are being designed by gathering significant amounts of data and then optimizing the system parameters directly using the obtained data. Often this is done without analyzing the dataset structure. As task complexity, data size, and parameters all increase to millions or even billions, data summarization is becoming a major challenge. In this work, we investigate data summarization via dictionary learning~(DL), leveraging the properties of recently introduced non-negative kernel regression (NNK) graphs. Our proposed NNK-Means, unlike previous DL techniques, such as kSVD, learns geometric dictionaries with atoms that are representative of the input data space. Experiments show that summarization using NNK-Means can provide better class separation compared to linear and kernel versions of kMeans and kSVD. Moreover, NNK-Means is scalable, with runtime complexity similar to that of kMeans.
LGJul 27, 2021
Channel-Wise Early Stopping without a Validation Set via NNK Polytope InterpolationDavid Bonet, Antonio Ortega, Javier Ruiz-Hidalgo et al.
State-of-the-art neural network architectures continue to scale in size and deliver impressive generalization results, although this comes at the expense of limited interpretability. In particular, a key challenge is to determine when to stop training the model, as this has a significant impact on generalization. Convolutional neural networks (ConvNets) comprise high-dimensional feature spaces formed by the aggregation of multiple channels, where analyzing intermediate data representations and the model's evolution can be challenging owing to the curse of dimensionality. We present channel-wise DeepNNK (CW-DeepNNK), a novel channel-wise generalization estimate based on non-negative kernel regression (NNK) graphs with which we perform local polytope interpolation on low-dimensional channels. This method leads to instance-based interpretability of both the learned data representations and the relationship between channels. Motivated by our observations, we use CW-DeepNNK to propose a novel early stopping criterion that (i) does not require a validation set, (ii) is based on a task performance metric, and (iii) allows stopping to be reached at different points for each channel. Our experiments demonstrate that our proposed method has advantages as compared to the standard criterion based on validation set performance.
LGJul 20, 2020
DeepNNK: Explaining deep models and their generalization using polytope interpolationSarath Shekkizhar, Antonio Ortega
Modern machine learning systems based on neural networks have shown great success in learning complex data patterns while being able to make good predictions on unseen data points. However, the limited interpretability of these systems hinders further progress and application to several domains in the real world. This predicament is exemplified by time consuming model selection and the difficulties faced in predictive explainability, especially in the presence of adversarial examples. In this paper, we take a step towards better understanding of neural networks by introducing a local polytope interpolation method. The proposed Deep Non Negative Kernel regression (NNK) interpolation framework is non parametric, theoretically simple and geometrically intuitive. We demonstrate instance based explainability for deep learning models and develop a method to identify models with good generalization properties using leave one out estimation. Finally, we draw a rationalization to adversarial and generative examples which are inevitable from an interpolation view of machine learning.
LGOct 21, 2019
Neighborhood and Graph Constructions using Non-Negative Kernel RegressionSarath Shekkizhar, Antonio Ortega
Data-driven neighborhood definitions and graph constructions are often used in machine learning and signal processing applications. k-nearest neighbor~(kNN) and $ε$-neighborhood methods are among the most common methods used for neighborhood selection, due to their computational simplicity. However, the choice of parameters associated with these methods, such as k and $ε$, is still ad hoc. We make two main contributions in this paper. First, we present an alternative view of neighborhood selection, where we show that neighborhood construction is equivalent to a sparse signal approximation problem. Second, we propose an algorithm, non-negative kernel regression~(NNK), for obtaining neighborhoods that lead to better sparse representation. NNK draws similarities to the orthogonal matching pursuit approach to signal representation and possesses desirable geometric and theoretical properties. Experiments demonstrate (i) the robustness of the NNK algorithm for neighborhood and graph construction, (ii) its ability to adapt the number of neighbors to the data properties, and (iii) its superior performance in local neighborhood and graph-based machine learning tasks.