Hossein Gorji

LG
h-index2
3papers
3citations
Novelty58%
AI Score38

3 Papers

LGOct 5, 2025
Can Linear Probes Measure LLM Uncertainty?

Ramzi Dakhmouche, Adrien Letellier, Hossein Gorji

Effective Uncertainty Quantification (UQ) represents a key aspect for reliable deployment of Large Language Models (LLMs) in automated decision-making and beyond. Yet, for LLM generation with multiple choice structure, the state-of-the-art in UQ is still dominated by the naive baseline given by the maximum softmax score. To address this shortcoming, we demonstrate that taking a principled approach via Bayesian statistics leads to improved performance despite leveraging the simplest possible model, namely linear regression. More precisely, we propose to train multiple Bayesian linear models, each predicting the output of a layer given the output of the previous one. Based on the obtained layer-level posterior distributions, we infer the global uncertainty level of the LLM by identifying a sparse combination of distributional features, leading to an efficient UQ scheme. Numerical experiments on various LLMs show consistent improvement over state-of-the-art baselines.

LGOct 5, 2025
Why Cannot Neural Networks Master Extrapolation? Insights from Physical Laws

Ramzi Dakhmouche, Hossein Gorji

Motivated by the remarkable success of Foundation Models (FMs) in language modeling, there has been growing interest in developing FMs for time series prediction, given the transformative power such models hold for science and engineering. This culminated in significant success of FMs in short-range forecasting settings. However, extrapolation or long-range forecasting remains elusive for FMs, which struggle to outperform even simple baselines. This contrasts with physical laws which have strong extrapolation properties, and raises the question of the fundamental difference between the structure of neural networks and physical laws. In this work, we identify and formalize a fundamental property characterizing the ability of statistical learning models to predict more accurately outside of their training domain, hence explaining performance deterioration for deep learning models in extrapolation settings. In addition to a theoretical analysis, we present empirical results showcasing the implications of this property on current deep learning architectures. Our results not only clarify the root causes of the extrapolation gap but also suggest directions for designing next-generation forecasting models capable of mastering extrapolation.

AIDec 20, 2024
Collision-based Dynamics for Multi-Marginal Optimal Transport

Mohsen Sadr, Hossein Gorji

Inspired by the Boltzmann kinetics, we propose a collision-based dynamics with a Monte Carlo solution algorithm that approximates the solution of the multi-marginal optimal transport problem via randomized pairwise swapping of sample indices. The computational complexity and memory usage of the proposed method scale linearly with the number of samples, making it highly attractive for high-dimensional settings. In several examples, we demonstrate the efficiency of the proposed method compared to the state-of-the-art methods.