Shourya Aggarwal

4.4 · AI · Jan 12

Stochastic CHAOS: Why Deterministic Inference Kills, and Distributional Variability Is the Heartbeat of Artificial Cognition

Tanmay Joshi, Shourya Aggarwal, Anusa Saha et al.

Deterministic inference is a comforting ideal in classical software: the same program on the same input should always produce the same output. As large language models move into real-world deployment, this ideal has been imported wholesale into inference stacks. Recent work from the Thinking Machines Lab has presented a detailed analysis of nondeterminism in LLM inference, showing how batch-invariant kernels and deterministic attention can enforce bitwise-identical outputs, positioning deterministic inference as a prerequisite for reproducibility and enterprise reliability. In this paper, we take the opposite stance. We argue that, for LLMs, deterministic inference kills. It kills the ability to model uncertainty, suppresses emergent abilities, collapses reasoning into a single brittle path, and weakens safety alignment by hiding tail risks. LLMs implement conditional distributions over outputs, not fixed functions. Collapsing these distributions to a single canonical completion may appear reassuring, but it systematically conceals properties central to artificial cognition. We instead advocate Stochastic CHAOS, treating distributional variability as a signal to be measured and controlled. Empirically, we show that deterministic inference is systematically misleading. Single-sample deterministic evaluation underestimates both capability and fragility, masking failure probability under paraphrases and noise. Phase-like transitions associated with emergent abilities disappear under greedy decoding. Multi-path reasoning degrades when forced onto deterministic backbones, reducing accuracy and diagnostic insight. Finally, deterministic evaluation underestimates safety risk by hiding rare but dangerous behaviors that appear only under multi-sample evaluation.
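The abstract's point that single-sample deterministic evaluation can hide rare behaviors can be illustrated with a toy sketch. Everything here is an assumption made for illustration, not the paper's method or data: the three-outcome distribution `OUTPUT_PROBS`, its probabilities, and the helper names are hypothetical.

```python
import random

# Hypothetical toy output distribution for a single prompt.
# The labels and probabilities are illustrative assumptions, not measured values.
OUTPUT_PROBS = {"safe_answer": 0.90, "hedged_answer": 0.08, "unsafe_answer": 0.02}


def greedy_decode(probs):
    """Deterministic decoding: always return the single most likely output."""
    return max(probs, key=probs.get)


def sample_decode(probs, rng):
    """Stochastic decoding: draw one output according to its probability."""
    outputs = list(probs)
    weights = [probs[o] for o in outputs]
    return rng.choices(outputs, weights=weights, k=1)[0]


def prob_seen_at_least_once(p, n):
    """Chance that an outcome with per-sample probability p shows up
    at least once in n independent samples."""
    return 1.0 - (1.0 - p) ** n


def empirical_rate(probs, target, n, seed=0):
    """Estimate how often `target` occurs by drawing n samples."""
    rng = random.Random(seed)
    hits = sum(sample_decode(probs, rng) == target for _ in range(n))
    return hits / n
```

Under these assumed numbers, `greedy_decode(OUTPUT_PROBS)` always returns `"safe_answer"`, so the 2% outcome is never observed, whereas `prob_seen_at_least_once(0.02, 100)` is roughly 0.87. A multi-sample estimate such as `empirical_rate(OUTPUT_PROBS, "unsafe_answer", 10_000)` recovers a rate near the assumed 0.02.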

2.9 · CR · Aug 18, 2020

Password Guessers Under a Microscope: An In-Depth Analysis to Inform Deployments

Zach Parish, Connor Cushing, Shourya Aggarwal et al.

Password guessers are instrumental for assessing the strength of passwords. Despite their diversity and abundance, little is known about how different guessers compare to each other. We perform in-depth analyses and comparisons of the guessing abilities and behavior of password guessers. To extend analyses beyond the number of passwords cracked, we devise an analytical framework to compare the types of passwords that guessers generate under various conditions (e.g., limited training data, limited number of guesses, and dissimilar training and target data). Our results show that guessers often produce dissimilar guesses, even when trained on the same data. We leverage this result to show that combinations of computationally cheap guessers are as effective as computationally intensive guessers, but more efficient. Our insights allow us to provide a concrete set of recommendations for system administrators when performing password checking.
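The observation that guessers trained on the same data still produce dissimilar guesses, and that combining them helps, can be sketched with plain set arithmetic. This is a toy illustration only: the guess lists, the target list, and the function names are hypothetical and do not come from the paper's framework or datasets.

```python
def cracked(guesses, targets):
    """Return the target passwords that appear among the guesses."""
    return set(guesses) & set(targets)


def coverage(guesses, targets):
    """Fraction of distinct target passwords cracked by the guesses."""
    targets = set(targets)
    if not targets:
        return 0.0
    return len(cracked(guesses, targets)) / len(targets)


def jaccard(a, b):
    """Jaccard similarity of two guess lists (1.0 means identical sets)."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


# Hypothetical guess lists from two cheap guessers and a hypothetical target set.
guesser_a = ["123456", "password", "qwerty", "letmein"]
guesser_b = ["123456", "dragon", "monkey", "iloveyou"]
targets = ["password", "dragon", "iloveyou", "hunter2", "123456"]

# Combining guessers here simply means pooling their guesses.
combined = guesser_a + guesser_b
```

With these made-up lists, the two guessers overlap on a single guess (`jaccard(guesser_a, guesser_b)` is 1/7), individual coverage is 0.4 and 0.6, and the pooled guesses reach 0.8. A real comparison would also account for guess ordering and guess budgets, which this sketch ignores.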

2 Papers