Anuj K. Nayak

LG
h-index: 34
3 papers
10 citations
Novelty: 75%
AI Score: 48

3 Papers

QUANT-PH · Mar 18
Iterative Decoding of Stabilizer Codes under Radiation-Induced Correlated Noise

Anuj K. Nayak, Paul G. Baity, Peter J. Love et al.

Fault-tolerant quantum computation demands extremely low logical error rates, yet superconducting qubit arrays are subject to radiation-induced correlated noise arising from quasiparticles generated by cosmic-ray muons. The quasiparticle (QP) density is unknown and time-varying, resulting in a mismatch between the true noise statistics and the priors assumed by standard decoders and, consequently, degraded logical performance. We formalize joint noise sensing and decoding using syndrome measurements by modeling the QP density as a latent variable that governs correlation in physical errors and syndrome measurements. Starting from a variational expectation-maximization approach, we derive an iterative algorithm that alternates between QP density estimation and syndrome-based decoding under the updated noise model. Simulations of surface-code and bivariate bicycle quantum memory under radiation-induced correlated noise demonstrate a measurable reduction in logical error probability relative to baseline decoding with a uniform prior. Beyond improved decoding performance, the inferred QP density provides diagnostic information relevant to device characterization, shielding, and chip design. These results indicate that integrating physical noise estimation into decoding can mitigate correlated noise effects and relax effective error-rate requirements for fault-tolerant quantum computation.
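
The alternation between noise estimation and decoding described in this abstract can be illustrated with a much simpler classical analogue. The sketch below is not the paper's algorithm: it assumes a classical repetition code, independent bit flips with a single unknown flip probability, and plain (non-variational) expectation-maximization. The function names (`simulate`, `em_flip_rate`, `decode`) and all parameter values are hypothetical and chosen only for illustration.

```python
import random


def simulate(num_blocks, n, p_true, seed=0):
    """Generate received repetition-code blocks with i.i.d. bit flips (toy noise model)."""
    rng = random.Random(seed)
    blocks = []
    for _ in range(num_blocks):
        sent = rng.randint(0, 1)  # transmitted bit, repeated n times
        blocks.append([sent ^ (rng.random() < p_true) for _ in range(n)])
    return blocks


def em_flip_rate(blocks, p_init=0.3, iters=50):
    """Alternate soft decoding (E-step) and flip-rate estimation (M-step)."""
    n = len(blocks[0])
    p = p_init
    for _ in range(iters):
        expected_flips = 0.0
        for bits in blocks:
            k = sum(bits)  # number of ones received in this block
            # E-step: posterior over the transmitted bit under the current p,
            # assuming a uniform prior on the transmitted bit.
            like0 = p ** k * (1 - p) ** (n - k)  # sent 0: k bits were flipped
            like1 = p ** (n - k) * (1 - p) ** k  # sent 1: n - k bits were flipped
            w0 = like0 / (like0 + like1)
            expected_flips += w0 * k + (1 - w0) * (n - k)
        # M-step: re-estimate the flip probability from the expected flip count.
        p = expected_flips / (n * len(blocks))
        p = min(max(p, 1e-6), 0.5)  # keep the estimate in a sensible range
    return p


def decode(bits, p):
    """Maximum-likelihood decoding of one block under the estimated flip rate."""
    n, k = len(bits), sum(bits)
    like0 = p ** k * (1 - p) ** (n - k)
    like1 = p ** (n - k) * (1 - p) ** k
    return 0 if like0 >= like1 else 1


if __name__ == "__main__":
    blocks = simulate(num_blocks=2000, n=5, p_true=0.1, seed=1)
    p_hat = em_flip_rate(blocks)
    print("estimated flip probability:", round(p_hat, 3))
    print("decoded first block:", decode(blocks[0], p_hat))
```

In this toy setting the decoder's hard decision does not depend on the estimated rate as long as it stays below 0.5. In the correlated-noise setting of the abstract, by contrast, the estimated QP density changes the noise model the decoder uses.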

CHEM-PH · Oct 20, 2025
Foundation Models for Discovery and Exploration in Chemical Space

Alexius Wadell, Anoushka Bhutani, Victor Azumah et al.

Accurate prediction of atomistic, thermodynamic, and kinetic properties from molecular structures underpins materials innovation. Existing computational and experimental approaches lack the scalability required to efficiently navigate chemical space. Scientific foundation models trained on large unlabeled datasets offer a path toward exploring chemical space across diverse application domains. Here we develop MIST, a family of molecular foundation models with up to an order of magnitude more parameters and data than prior work. Trained using a novel tokenization scheme that comprehensively captures nuclear, electronic, and geometric information, MIST learns from a diverse range of molecules. MIST models have been fine-tuned to predict more than 400 structure-property relationships and match or exceed state-of-the-art performance across benchmarks spanning physiology, electrochemistry, and quantum chemistry. We demonstrate the ability of these models to solve real-world problems across chemical space, including multiobjective electrolyte solvent screening, olfactory perception mapping, isotope half-life prediction, stereochemical reasoning for chiral organometallic compounds, and binary and multi-component mixture property prediction. Probing MIST models using mechanistic interpretability methods reveals identifiable patterns and trends not explicitly present in the training data, suggesting that the models learn generalizable scientific concepts. We formulate hyperparameter-penalized Bayesian neural scaling laws and use them to reduce the computational cost of model development by an order of magnitude. The methods and findings presented here represent a significant step toward accelerating materials discovery, design, and optimization using foundation models and provide valuable guidance for training compute-optimal scientific foundation models.

LG · Jun 10, 2025
A Theory of Inference Compute Scaling: Reasoning through Directed Stochastic Skill Search

Austin R. Ellis-Mohr, Anuj K. Nayak, Lav R. Varshney

Large language models (LLMs) demand considerable computational, energy, and financial resources during both training and deployment. While scaling laws for training have guided much of the field's recent progress, inference costs now represent a significant and growing component of the overall resource burden, particularly for reasoning-focused models. Existing characterizations of compute-optimality that consider model size, dataset size, and inference tokens in isolation or in fixed combinations risk overlooking more efficient operating points. We introduce directed stochastic skill search (DS3), a general framework that represents inference as stochastic traversal over a learned skill graph. From a simplified yet expressive instantiation, we derive closed-form expressions for task success and compute cost across a wide range of inference strategies, including chain-of-thought (CoT) and tree-of-thought (ToT), enabling comparative analysis as a function of task difficulty and model capability. To that end, we extend a prior first-principles tripartite graph framework of LLM training to incorporate inference, and separately bridge DS3 with empirical methods that characterize LLM scaling behavior. We theoretically recover empirically observed patterns, including linear accuracy scaling with logarithmic compute; variation in preferred inference strategies as a function of task difficulty and model capability; emergent behavior elicited by reasoning even when performance plateaus under parameter scaling; and both best-of-N (BoN) and majority voting behavior captured within a unified analytical framework. By explicitly characterizing training-inference interdependencies, our framework deepens theoretical understanding and supports principled algorithmic design and resource allocation.
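
For readers unfamiliar with the best-of-N and majority-voting strategies named in this abstract, the following minimal sketch computes their success probabilities under textbook assumptions. It does not reproduce the DS3 closed-form expressions. It assumes N independent samples that are each correct with probability p, a perfect verifier for best-of-N, and a binary correct/incorrect outcome for majority voting (all incorrect samples agree with each other). The function names are hypothetical.

```python
from math import comb


def best_of_n(p, n):
    """Probability that at least one of n independent samples is correct.

    Assumes a perfect verifier that always picks a correct sample if one exists.
    """
    return 1.0 - (1.0 - p) ** n


def majority_vote(p, n):
    """Probability that a strict majority of n independent samples is correct.

    Assumes a binary outcome (all incorrect samples agree) and odd n to avoid ties.
    """
    if n % 2 == 0:
        raise ValueError("use an odd number of samples to avoid ties")
    return sum(
        comb(n, k) * p ** k * (1.0 - p) ** (n - k)
        for k in range(n // 2 + 1, n + 1)
    )


if __name__ == "__main__":
    for p in (0.4, 0.6):
        for n in (1, 3, 9):
            print(f"p={p} n={n} "
                  f"best-of-N={best_of_n(p, n):.3f} "
                  f"majority={majority_vote(p, n):.3f}")
```

Under these assumptions, best-of-N improves with more samples for any p > 0, while majority voting improves only when p > 0.5 and degrades otherwise. This is one simple way in which the preferred strategy can depend on task difficulty and model capability.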