LG · CG · Mar 19

GeoLAN: Geometric Learning of Latent Explanatory Directions in Large Language Models

arXiv:2603.19460 · 36.5 · h-index: 4

Predicted impact: top 66% in LG · last 90 days · Originality: Incremental advance

AI Analysis

This work addresses interpretability issues in LLMs for researchers and practitioners, though it appears incremental as it builds on existing geometric concepts.

The authors address the lack of transparency in large language models by introducing GeoLAN, a training framework that treats token representations as geometric trajectories and applies stickiness conditions to them. According to the abstract, GeoLAN frequently maintains task accuracy while improving geometric metrics and reducing certain fairness biases, with the largest benefits in mid-sized models.

Large language models (LLMs) demonstrate strong performance, but they often lack transparency. We introduce GeoLAN, a training framework that treats token representations as geometric trajectories and applies stickiness conditions inspired by recent developments related to the Kakeya Conjecture. We have developed two differentiable regularizers, Katz-Tao Convex Wolff (KT-CW) and Katz-Tao Attention (KT-Attn), that promote isotropy and encourage diverse attention. Our experiments with Gemma-3 (1B, 4B, 12B) and Llama-3-8B show that GeoLAN frequently maintains task accuracy while improving geometric metrics and reducing certain fairness biases. These benefits are most significant in mid-sized models. Our findings reveal scale-dependent trade-offs between geometric precision and performance, suggesting that geometry-aware training is a promising approach to enhance mechanistic interpretability.
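
The abstract does not give the formulas for KT-CW or KT-Attn, so the following is only a rough sketch of the two quantities those regularizers are said to promote: isotropy of token representations and diversity of attention. It uses generic diagnostics (mean pairwise cosine similarity and Shannon entropy of an attention row), not the paper's regularizers. The function names `mean_pairwise_cosine` and `attention_entropy` are hypothetical and chosen for illustration.

```python
import math


def mean_pairwise_cosine(vectors):
    """Average cosine similarity over all distinct pairs of vectors.

    A generic anisotropy proxy (an assumption, not the paper's KT-CW):
    values near 0 suggest directions are spread out, values near 1
    suggest the representations collapse toward one direction.
    """
    def norm(v):
        return math.sqrt(sum(x * x for x in v))

    total, count = 0.0, 0
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            dot = sum(a * b for a, b in zip(vectors[i], vectors[j]))
            total += dot / (norm(vectors[i]) * norm(vectors[j]))
            count += 1
    return total / count


def attention_entropy(weights):
    """Shannon entropy (in nats) of a single attention row.

    A generic diversity proxy (an assumption, not the paper's KT-Attn):
    higher entropy means attention is spread over more positions.
    """
    return -sum(w * math.log(w) for w in weights if w > 0)


if __name__ == "__main__":
    orthogonal = [[1.0, 0.0], [0.0, 1.0]]
    collapsed = [[1.0, 0.0], [2.0, 0.0]]
    print(mean_pairwise_cosine(orthogonal))
    print(mean_pairwise_cosine(collapsed))
    print(attention_entropy([0.25, 0.25, 0.25, 0.25]))
    print(attention_entropy([1.0, 0.0, 0.0, 0.0]))
```

A training-time regularizer would need differentiable versions of such quantities computed on model activations. These pure-Python helpers are only meant to make the two notions concrete.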

