LG AI ML Aug 9, 2025

Zero-Direction Probing: A Linear-Algebraic Framework for Deep Analysis of Large-Language-Model Drift

arXiv:2508.06776v1

Originality: Incremental advance

AI Analysis

This paper provides a theoretical foundation for monitoring representational changes in large language models. The advance is incremental because it builds on existing concepts of null spaces and Fisher geometry.

The paper tackles the problem of detecting model drift in large language models. It introduces Zero-Direction Probing (ZDP), a theory-only framework that uses the null directions of transformer activations and requires neither task labels nor output evaluations. It proves four results under stated assumptions and derives the Spectral Null-Leakage (SNL) metric, with non-asymptotic bounds, for drift detection.

We present Zero-Direction Probing (ZDP), a theory-only framework for detecting model drift from null directions of transformer activations without task labels or output evaluations. Under assumptions A1–A6, we prove: (i) the Variance–Leak Theorem, (ii) Fisher Null-Conservation, (iii) a Rank–Leak bound for low-rank updates, and (iv) a logarithmic-regret guarantee for online null-space trackers. We derive a Spectral Null-Leakage (SNL) metric with non-asymptotic tail bounds and a concentration inequality, yielding a-priori thresholds for drift under a Gaussian null model. These results show that monitoring right/left null spaces of layer activations and their Fisher geometry provides concrete, testable guarantees on representational change.
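
The abstract does not give the SNL formula, so the following is only a minimal sketch of the general idea of monitoring a null space for leakage, not the paper's metric. It assumes a hypothetical activation matrix of shape (samples, hidden dimension), estimates the right null space of a baseline batch by SVD, and reports the fraction of a later batch's energy that falls inside that null space. The function names `right_null_basis` and `null_leakage`, the tolerance, and the synthetic data are all illustrative assumptions.

```python
import numpy as np


def right_null_basis(acts, rtol=1e-10):
    """Orthonormal basis (d x k) for the right null space of acts (n x d)."""
    _, s, vt = np.linalg.svd(acts, full_matrices=True)
    # Singular values below rtol * largest are treated as zero (assumed tolerance).
    tol = rtol * (s[0] if s.size else 1.0)
    rank = int(np.sum(s > tol))
    return vt[rank:].T


def null_leakage(acts, null_basis):
    """Fraction of activation energy lying inside the reference null space."""
    total = float(np.sum(acts ** 2))
    if total == 0.0:
        return 0.0
    leaked = float(np.sum((acts @ null_basis) ** 2))
    return leaked / total


# Synthetic stand-in for layer activations: rank-10 data in 16 dimensions.
rng = np.random.default_rng(0)
d, r, n = 16, 10, 200
mixing = rng.standard_normal((r, d))
baseline = rng.standard_normal((n, r)) @ mixing
basis = right_null_basis(baseline)

# A fresh batch from the same subspace, and a perturbed ("drifted") batch.
same = rng.standard_normal((n, r)) @ mixing
drifted = same + 0.1 * rng.standard_normal((n, d))

leak_same = null_leakage(same, basis)
leak_drifted = null_leakage(drifted, basis)
print(f"leakage without drift: {leak_same:.3e}")
print(f"leakage with drift:    {leak_drifted:.3e}")
```

In this toy setup the undrifted batch leaks essentially nothing, while the perturbed batch puts measurable energy into the baseline null directions. Choosing a decision threshold is left open here; the paper derives a-priori thresholds under a Gaussian null model, which this sketch does not reproduce.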
