CL IT LG LO

Feb 24, 2025

Quantifying Logical Consistency in Transformers via Query-Key Alignment

Eduard Tulchinskii, Anastasia Voznyuk, Laida Kushnareva, Andrei Andriiainen, Irina Piontkovskaya, Evgeny Burnaev, Serguei Barannikov

arXiv:2502.17017v14.91 citations

h-index: 36

EMNLP

Originality: Incremental advance

AI Analysis

This work addresses the problem of assessing logical reasoning in LLMs for researchers and practitioners, offering a scalable evaluation tool that is an incremental advance over existing prompting-based methods.

The paper tackles the challenge of evaluating logical consistency in large language models by proposing a lightweight method using query-key alignments in transformer attention heads, which reliably separates valid from invalid inferences and shows improved robustness on multiple benchmarks.

Large language models (LLMs) have demonstrated impressive performance in various natural language processing tasks, yet their ability to perform multi-step logical reasoning remains an open challenge. Although Chain-of-Thought prompting has improved logical reasoning by enabling models to generate intermediate steps, it lacks mechanisms to assess the coherence of these logical transitions. In this paper, we propose a novel, lightweight evaluation strategy for logical reasoning that uses query-key alignments inside transformer attention heads. By computing a single forward pass and extracting a "QK-score" from carefully chosen heads, our method reveals latent representations that reliably separate valid from invalid inferences, offering a scalable alternative to traditional ablation-based techniques. We also provide an empirical validation on multiple logical reasoning benchmarks, demonstrating improved robustness of our evaluation method against distractors and increased reasoning depth. The experiments were conducted on a diverse set of models, ranging from 1.5B to 70B parameters.
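The abstract does not give a formula for the "QK-score", so the following is only a minimal sketch under stated assumptions: the score is taken to be the unnormalized dot product between the query vector at one token position and the key vector at a candidate-answer position, within a single attention head. The names `qk_score` and `pick_candidate`, the random weights, and the token positions are all hypothetical and stand in for values that would be read from a real model in one forward pass.

```python
import numpy as np


def qk_score(hidden, w_q, w_k, query_pos, key_pos):
    """Unnormalized query-key dot product for one attention head.

    hidden: (seq_len, d_model) hidden states entering the head (assumed layout).
    w_q, w_k: (d_model, d_head) projection matrices of that head (assumed layout).
    """
    q = hidden[query_pos] @ w_q  # query vector at the position asking the question
    k = hidden[key_pos] @ w_k    # key vector at the candidate-answer position
    return float(q @ k)


def pick_candidate(hidden, w_q, w_k, query_pos, candidate_positions):
    """Return the index of the highest-scoring candidate and all scores."""
    scores = [qk_score(hidden, w_q, w_k, query_pos, p) for p in candidate_positions]
    return int(np.argmax(scores)), scores


# Toy data in place of real model activations (an assumption for illustration).
rng = np.random.default_rng(0)
seq_len, d_model, d_head = 12, 16, 4
hidden = rng.normal(size=(seq_len, d_model))
w_q = rng.normal(size=(d_model, d_head))
w_k = rng.normal(size=(d_model, d_head))

# Hypothetical layout: the last token is the query, and positions 7 and 9
# hold the tokens for the "valid" and "invalid" answer options.
best, scores = pick_candidate(hidden, w_q, w_k, seq_len - 1, [7, 9])
print(best, scores)
```

In this reading, the score for each candidate is one entry of the head's pre-softmax attention logit matrix, which is why a single forward pass suffices. How the heads are selected, and whether any scaling or positional terms are applied, is not stated in the abstract and is left out here.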
