Vaibhav Jindal

h-index3
2papers

2 Papers

COMar 23, 2023
Predicting the Initial Conditions of the Universe using a Deterministic Neural Network

Vaibhav Jindal, Albert Liang, Aarti Singh et al.

Finding the initial conditions that led to the current state of the universe is challenging because it involves searching over an intractable input space of initial conditions, along with modeling their evolution via tools such as N-body simulations which are computationally expensive. Recently, deep learning has emerged as a surrogate for N-body simulations by directly learning the mapping between the linear input of an N-body simulation and the final nonlinear output from the simulation, significantly accelerating the forward modeling. However, this still does not reduce the search space for initial conditions. In this work, we pioneer the use of a deterministic convolutional neural network for learning the reverse mapping and show that it accurately recovers the initial linear displacement field over a wide range of scales ($<1$-$2\%$ error up to nearly $k\simeq0.8$-$0.9 \text{ Mpc}^{-1}h$), despite the one-to-many mapping of the inverse problem (due to the divergent backward trajectories at smaller scales). Specifically, we train a V-Net architecture, which outputs the linear displacement of an N-body simulation, given the nonlinear displacement at redshift $z=0$ and the cosmological parameters. The results of our method suggest that a simple deterministic neural network is sufficient for accurately approximating the initial linear states, potentially obviating the need for the more complex and computationally demanding backward modeling methods that were recently proposed.

LGOct 26, 2025
Aligning Diffusion Language Models via Unpaired Preference Optimization

Vaibhav Jindal, Hejian Sang, Chun-Mao Lai et al.

Diffusion language models (dLLMs) are an emerging alternative to autoregressive (AR) generators, but aligning them to human preferences is challenging because sequence log-likelihoods are intractable and pairwise preference data are costly to collect. We introduce ELBO-KTO, which combines an ELBO surrogate for diffusion log-likelihoods with a prospect-theoretic, unpaired preference objective (Kahneman Tversky Optimization, KTO). We analyze the bias and variance induced by the ELBO substitution and employ variance-reduction practices that stabilize gradients during training. Applied to LLaDA-8B-Instruct, ELBO-KTO yields 65.9% and 62.3% adjusted win rates on kto-mix-14k and UltraFeedback-Binary, respectively, versus the base model under an automatic LLM judge. Across downstream tasks, including GSM8K, MMLU, and additional reasoning/knowledge benchmarks, ELBO-KTO trained on UltraFeedback-Binary performs on par with or better than the base model under identical decoding. This establishes unpaired preference optimization as a viable alternative to pairwise alignment in diffusion LLMs.